Abstract

We consider the problem of monitoring spatial phenomena, such as road speeds on a highway, using
wireless sensors with limited battery life. A central question is to decide where to locate these
sensors to best predict the phenomenon at the unsensed locations. However, given the power
constraints, we also need to determine when to selectively activate these sensors in order to maximize
the performance while satisfying lifetime requirements. Traditionally, these two problems of sensor
placement and scheduling have been considered separately from each other; one first decides where
to place the sensors, and then when to activate them.

In this paper, we present an efficient algorithm, eSPASS, that simultaneously optimizes the
placement and the schedule. We prove that eSPASS provides a constant-factor approximation to the
optimal solution of this NP-hard optimization problem. A salient feature of our approach is that
it obtains “balanced” schedules that perform uniformly well over time, rather than only on
average. We then extend the algorithm to allow for a smooth power-accuracy tradeoff. Our algorithm
applies to complex settings where the sensing quality of a set of sensors is measured, e.g., in the
improvement of prediction accuracy (more formally, to situations where the sensing quality function
is submodular). We present extensive empirical studies on several sensing tasks, and our results
show that simultaneously placing and scheduling gives drastically improved performance compared
to separate placement and scheduling (e.g., a 33% improvement in network lifetime on the traffic
prediction task).


*  School  of  Computer  Science,  Carnegie  Mellon  University,  Pittsburgh,  PA,  USA 
^  EECS,  University  of  California,  Berkeley,  CA,  USA. 
Figure 1: Sensys Networks wireless traffic sensor: (left) encased unit, (middle) sensor deployed in
pavement, (right) GPRS/CDMA base station.

1  Introduction 

When  monitoring  spatial  phenomena,  such  as  road  speeds  on  a  highway,  deciding  where  to  place  a 
small  number  of  sensors  to  obtain  best  prediction  accuracy  is  an  important  task.  Figure  1  shows 
a  Sensys  Networks  wireless  traffic  sensor  (Haoui  et  al.,  2008),  that  provides  30  second  aggregate 
speed,  flow  and  vehicle  density  measurements.  Currently  the  system  is  being  deployed  by  Caltrans 
at  different  sites  in  California,  including  highways  and  arterial  roads.  When  using  such  wireless 
sensor  networks,  power  consumption  is  a  key  constraint,  since  every  measurement  drains  the  battery. 
For  applications  such  as  road  speed  monitoring,  a  minimum  battery  lifetime  is  required  to  ensure 
feasibility  of  the  sensor  network  deployment.  One  approach  to  meeting  such  lifetime  requirements  is 
to  deploy  few  nodes  with  large  batteries.  However,  such  an  approach  can  be  sensitive  to  node  failures. 
Additionally,  packaging  constraints  can  limit  the  size  of  the  battery  deployed  with  the  nodes.  For 
these  and  other  reasons,  it  can  be  more  effective  to  deploy  a  larger  number  of  nodes  with  smaller 
batteries,  that  are  activated  only  a  fraction  of  the  time.  Hence,  to  improve  the  lifetime  of  such  a 
sensor  network,  the  problem  of  scheduling  becomes  of  crucial  importance:  Given  a  fixed  placement 
of  sensors,  when  should  we  turn  each  sensor  on  in  order  to  obtain  high  monitoring  performance  over 
all time steps? One approach that has been found effective in the past is to partition the sensors
into k groups (Abrams et al., 2004; Deshpande et al., ’08; Koushanfary et al., 2006). By activating a
different group of sensors at each time step and cyclically shifting through these groups, the network
lifetime can effectively be increased by a factor of k. In the traffic network application, current
studies indicate that an increase by a factor of k = 4 would be required to make sensor deployment
an economically feasible option (c.f., Section 6 for more details).

Traditionally, sensor placement and sensor scheduling have been considered separately from each
other: one first decides where to place the sensors, and then when to activate them. In this paper,
we present an efficient algorithm, eSPASS (for efficient Simultaneous Placement and Scheduling
of Sensors), that jointly optimizes the sensor placement and the sensor schedule. We prove that
our algorithm provides a constant-factor approximation to the optimal solution of this NP-hard
optimization problem.

Most existing approaches to sensor placement and scheduling associate a fixed sensing region
with every sensor, and then attempt to maximize the number of regions covered in every group of
sensors (c.f., Abrams et al. (2004); Deshpande et al. (’08); Hochbaum and Maas (1985)). In complex
applications such as traffic or environmental monitoring, however, the goal of sensor placement is
a prediction problem, where one intends to predict the sensed phenomenon at the locations where
no sensors are placed. Our algorithm applies to such settings where the sensing quality of a set of
sensors is measured, e.g., in the improvement of prediction accuracy (more formally, our algorithm
applies whenever the sensing quality function satisfies submodularity, an intuitive diminishing returns
property).

In contrast to most existing algorithms that optimize scheduling for average-case performance, our
approach furthermore provides a schedule that performs uniformly well over time, hence leading to a
well-balanced performance of the sensor network. For security-critical applications such as outbreak
detection, such balanced performance is a crucial requirement not met by existing algorithms. In fact,
our experimental results show that average-case optimal solutions can lead to arbitrarily unbalanced
performance, but optimizing for balanced performance (using eSPASS) typically leads to good
average-case performance.

Deploying  a  large  number  of  scheduled  sensors  has  the  additional  benefit  that  it  allows  trading 
off  power  and  accuracy.  The  deployed  network  might  have  several  modes  of  operation:  a  scheduled 
mode  of  operation,  where  only  a  small  fraction  of  sensors  is  turned  on,  and  a  “high  density”  mode 
where  all  (or  a  larger  fraction  of)  sensors  are  activated.  For  example,  in  traffic  monitoring,  once 
a  traffic  congestion  is  detected  (during  scheduled  mode),  the  high  density  mode  could  be  used  to 
accurately  identify  the  boundary  of  the  congestion.  We  show  how  our  algorithm  can  be  extended  to 
support  such  a  power-accuracy  tradeoff. 

We present extensive empirical results on several case studies, illustrating the versatility of our
algorithm. These case studies include sensing tasks such as traffic and environmental monitoring,
placing sensors for outbreak detection, and selecting informative weblogs to read on the Internet. Our
results show that simultaneously placing and scheduling results in drastically improved performance
compared to the setting where optimization over the placement and the scheduling are performed
separately.

In  summary,  our  main  contributions  are: 

•  We  study  the  problem  of  simultaneously  placing  and  scheduling  sensors  as  a  novel  optimization 
problem. 

•  We develop eSPASS, an efficient approximation algorithm for this problem that applies to
a variety of realistic sensing quality functions (such as area coverage, variance reduction,
outbreak detection, etc.). Our algorithm is guaranteed to provide a near-optimal solution that
obtains at least a constant fraction of the optimal sensing quality. eSPASS furthermore allows
trading off power consumption and accuracy.

•  We  perform  several  extensive  case  studies  on  real  sensing  problems  in  traffic  and  environmental 
monitoring  as  well  as  outbreak  detection,  demonstrating  the  effectiveness  of  our  approach. 

2  Problem  Statement 

We  will  first  separately  introduce  the  sensor  placement  and  scheduling  problems,  and  then  formalize 
the  problem  of  simultaneously  placing  and  scheduling  sensors. 

2.1  Sensor  Placement 

Figure 2: In the stage-wise approach, sensors are first deployed (a), and the deployed sensors are then
scheduled (b; sensors assigned to the same time slot are drawn using the same color and marker). In
the simultaneous approach, we jointly optimize over placement and schedule (c). (d) Multicriterion
solution to Problem (4) ($\lambda = 0.25$) that performs well both in scheduled and high-density mode.
Panels: (a) optimized sensor placement, (b) optimized schedule for placement (a), (c) simultaneous
placement and schedule, (d) multicriterion solution.

In sensor placement, we are given a finite set $V$ of possible locations where sensors can be placed.
Our goal is to select a small subset $A \subseteq V$ of locations to place sensors at, that maximizes a sensing
quality function $F(A)$. There are several different notions of sensing quality that we might want
to optimize, each depending on the particular sensing task. For example, we can associate sensing
regions with every sensor, and $F(A)$ can measure the total area covered when placing sensors at
locations $A$. In complex applications such as the traffic monitoring problem, we are interested in
optimizing the prediction accuracy when obtaining measurements from locations $A$. In this setting,
we can model the state of the world (e.g., the traffic condition at different locations) using a collection
of random variables $X_V$, one variable $X_s$ for each location $s \in V$. We can then use a probabilistic
model (such as a Gaussian Process, which is frequently used in geostatistics, c.f., Cressie (1991))
that models a joint probability distribution $P(X_V)$ over the possible locations. Upon acquiring
measurements $X_A = x_A$ at a subset of locations $A$, we can then predict the phenomenon at the
unobserved locations using the conditional distribution $P(X_{V \setminus A} \mid X_A = x_A)$. We can then use the
expected mean squared error,

$$\mathrm{Var}(X_V \mid X_A = x_A) = \frac{1}{|V|} \sum_{s \in V} \mathbb{E}\left[(X_s - \mathbb{E}[X_s \mid x_A])^2 \mid x_A\right]$$

to quantify the uncertainty in this prediction. Since we do not know the values $x_A$ before placing the
sensors, a natural choice of the sensing quality function $F(A)$ is to measure the expected reduction
in variance at the unobserved locations,

$$F(A) = \mathrm{Var}(X_V) - \int P(x_A)\, \mathrm{Var}(X_V \mid X_A = x_A)\, dx_A.$$

This sensing quality function has been found useful for sensor selection (c.f., Deshpande et al. (2004);
Krause et al. (2008a)) and experimental design (c.f., Chaloner and Verdinelli (1995)).
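
For the Gaussian case mentioned above, the variance reduction objective has a closed form, because the conditional covariance of a multivariate Gaussian does not depend on the observed values $x_A$. The following is a worked sketch under the assumption that $X_V$ is jointly Gaussian with covariance matrix $\Sigma$ and that $\Sigma_{AA}$ is invertible; the block notation $\Sigma_{VA}$ (rows indexed by $V$, columns by $A$) is introduced here for illustration only.

$$\mathrm{Var}(X_V \mid X_A = x_A) = \frac{1}{|V|}\, \mathrm{tr}\left(\Sigma_{VV} - \Sigma_{VA} \Sigma_{AA}^{-1} \Sigma_{AV}\right) \quad \text{for every } x_A,$$

$$F(A) = \frac{1}{|V|}\, \mathrm{tr}(\Sigma_{VV}) - \frac{1}{|V|}\, \mathrm{tr}\left(\Sigma_{VV} - \Sigma_{VA} \Sigma_{AA}^{-1} \Sigma_{AV}\right) = \frac{1}{|V|}\, \mathrm{tr}\left(\Sigma_{VA} \Sigma_{AA}^{-1} \Sigma_{AV}\right).$$

Under these assumptions the integral over $x_A$ disappears, so $F(A)$ can be evaluated from the covariance matrix alone, before any measurement is taken.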

It can be shown that both the area covered and the variance reduction objective, as well as many
other notions of sensing quality, satisfy the following intuitive diminishing returns property (Das
and Kempe, 2008; Krause et al., 2007)^1: Adding a sensor helps more if we have placed few sensors
so far, and less if we already have placed lots of sensors. This intuition can be formalized using the
combinatorial concept of submodularity: A set function $F$ is called submodular if, for all $A \subseteq B \subseteq V$
and $s \in V \setminus B$,

$$F(A \cup \{s\}) - F(A) \ge F(B \cup \{s\}) - F(B),$$

i.e., adding $s$ to a small set $A$ helps more than adding $s$ to the superset $B$. In addition, these sensing
quality functions are monotonic: For all $A \subseteq B \subseteq V$ it holds that $F(A) \le F(B)$, i.e., adding more sensors
can only improve the sensing quality.
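
The two properties can be checked by brute force on a toy instance. The sketch below is illustrative only: the sensing regions in `REGIONS` and the helper names are hypothetical, and the exhaustive check is exponential in the size of the ground set, so it is only meant for very small examples.

```python
from itertools import combinations

# Hypothetical sensing regions: each location covers a set of grid cells.
REGIONS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}


def coverage(A):
    """F(A): number of cells covered by sensors placed at the locations in A."""
    covered = set()
    for s in A:
        covered |= REGIONS[s]
    return len(covered)


def subsets(items):
    """Yield every subset of items as a frozenset."""
    items = list(items)
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            yield frozenset(combo)


def is_monotone_submodular(F, V):
    """Check F(A) <= F(B) and the diminishing-returns inequality for all A in B in V."""
    for B in subsets(V):
        for A in subsets(B):
            if F(A) > F(B):  # monotonicity violated
                return False
            for s in set(V) - B:
                if F(A | {s}) - F(A) < F(B | {s}) - F(B):  # submodularity violated
                    return False
    return True
```

For example, `is_monotone_submodular(coverage, REGIONS)` holds for the coverage function, whereas a function such as `lambda A: len(A) ** 2` fails the check because its marginal gains grow with the set size.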

Based on this notion of a monotonic, submodular sensing quality function, the sensor placement
problem then is

$$\max_{A \subseteq V} F(A) \quad \text{such that} \quad |A| \le m,$$

i.e., we want to find a set $A$ of at most $m$ locations to place sensors maximizing the sensing quality
$F$.


2.2  Sensor  Scheduling 

In sensor scheduling, we are given a sensor placement (i.e., locations $A$), and our goal is to assign each
sensor $s \in A$ one of $k$ time slots. This assignment partitions the set $A$ into disjoint sets $A_1, \dots, A_k$,
where $A_t \subseteq A$ is the subset of sensors that have been assigned slot $t$. A round-robin schedule can
then be applied that cycles through the time slots, and activates sensors $A_t$ at time $t$. Since each
sensor is active at only one out of $k$ time slots, this procedure effectively increases the lifetime of the
network by a factor of $k$. How can we quantify the value of a schedule $\mathcal{A} = (A_1, \dots, A_k)$? For each
group $A_t$, we can compute the sensing quality $F(A_t)$.^2 One possibility would then be to optimize
for the average performance over time,

$$\max_{A_1, \dots, A_k} \frac{1}{k} \sum_{t=1}^{k} F(A_t).$$


However, as we show in our experiments, if we optimize for the average-case performance, it can
happen that a few of the time slots are very poorly covered, i.e., there is a time $t$ such that $F(A_t)$ is
very low. For security-critical applications, this can be problematic. Instead, we can also optimize
for a balanced schedule,

$$\max_{A_1, \dots, A_k} \min_t F(A_t),$$

that performs uniformly well over time.
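
To make the difference between the two objectives concrete, the following sketch evaluates both on a small coverage instance. The regions and the two example schedules are hypothetical; they are chosen so that the average score is identical while the balanced score differs.

```python
# Hypothetical sensing regions for three locations.
REGIONS = {"a": {1, 2, 3, 4}, "b": {5, 6, 7, 8}, "c": {1, 2}}


def coverage(A):
    """F(A): number of cells covered by the sensors in A."""
    covered = set()
    for s in A:
        covered |= REGIONS[s]
    return len(covered)


def average_score(F, schedule):
    """Average-case objective: (1/k) * sum_t F(A_t)."""
    return sum(F(A_t) for A_t in schedule) / len(schedule)


def balanced_score(F, schedule):
    """Balanced objective: min_t F(A_t)."""
    return min(F(A_t) for A_t in schedule)


lopsided = [{"a", "b"}, {"c"}]   # slot scores 8 and 2
balanced = [{"a"}, {"b", "c"}]   # slot scores 4 and 6
```

Both schedules have the same average score, so the average-case objective cannot tell them apart, while the balanced objective prefers the second schedule.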

Note that the above formulation of the scheduling problem also handles settings where
each sensor can be active at $r > 1$ timesteps. In this setting, we simply define a new ground set
$A' = A \times \{1, \dots, r\}$, where the pair $(s, i) \in A'$ refers to the $i$-th activation of sensor $s$. The sensing
quality function is modified as $F'(A') = F(\{s : \exists i\, (s, i) \in A'\})$.
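
A minimal sketch of this construction is shown below. The helper names `expand_ground_set` and `lift` are hypothetical; the code only illustrates how the pairs $(s, i)$ are formed and how $F'$ projects them back to sensor locations.

```python
def expand_ground_set(A, r):
    """Build A' = A x {1, ..., r}: one copy of each sensor per allowed activation."""
    return {(s, i) for s in A for i in range(1, r + 1)}


def lift(F):
    """Return F' with F'(A') = F({s : (s, i) in A' for some i})."""
    def F_prime(A_prime):
        return F({s for (s, _i) in A_prime})
    return F_prime
```

With `F = len`, for instance, `lift(len)({("a", 1), ("a", 2)})` evaluates to 1, since both pairs refer to the same sensor and activating it twice in one slot adds nothing.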


2.3  Simultaneous  placement  and  scheduling 

Both sensor placement and sensor scheduling have been studied separately from each other in the
past. One approach towards placement and scheduling would be to first use an algorithm (such
as the algorithm proposed by Krause et al. (2007)) to find a sensor placement $A$, and then use
a separate algorithm (such as the mixed integer approach of Koushanfary et al. (2006)) to find a
schedule $A_1, \dots, A_k$. We call this approach a stage-wise approach, and illustrate it in Figures 2(a)
and 2(b).

^1 Variance reduction has been shown to be submodular for Gaussian distributions under certain assumptions about
the covariance by Das and Kempe (2008).

^2 Note that we assume the same sensing quality function $F$ for each time step. This assumption has been made in
the past (c.f., Koushanfary et al. (2006); Abrams et al. (2004)), and is reasonable for many monitoring tasks.

Instead of separating placement and scheduling, we can simultaneously optimize for the placement
and the schedule. Suppose we have resources to purchase $m$ sensors, and we would like to extend
the network lifetime by a factor of $k$. Our goal would then be to find $k$ disjoint sets $A_1, \dots, A_k \subseteq V$,
such that together these sets contain at most $m$ locations, i.e., $|\bigcup_t A_t| \le m$. We call this problem
the SPASS problem, for simultaneous placement and scheduling of sensors. Again, we can consider
the average-case performance,

$$\max_{A_1, \dots, A_k} \frac{1}{k} \sum_{t=1}^{k} F(A_t) \quad \text{s.t.} \quad A_i \cap A_j = \emptyset \text{ if } i \ne j \text{ and } \Big|\bigcup_t A_t\Big| \le m \qquad (1)$$

and the balanced objective:

$$\max_{A_1, \dots, A_k} \min_t F(A_t) \quad \text{s.t.} \quad A_i \cap A_j = \emptyset \text{ if } i \ne j \text{ and } \Big|\bigcup_t A_t\Big| \le m. \qquad (2)$$
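
For very small instances, both problems can be solved by exhaustive enumeration, which is useful as a reference point when testing approximation algorithms. The sketch below is not part of the paper's method; the function name `brute_force_spass` is hypothetical, and its running time grows as $(k+1)^{|V|}$.

```python
from itertools import product


def brute_force_spass(F, V, k, m, balanced=True):
    """Enumerate all assignments of locations to slots 1..k (0 means unused)."""
    V = list(V)
    best_value, best_schedule = None, None
    for labels in product(range(k + 1), repeat=len(V)):
        # Enforce the budget: at most m locations may be used in total.
        if sum(1 for t in labels if t > 0) > m:
            continue
        # Each location carries exactly one label, so the slots are disjoint.
        schedule = [{s for s, t in zip(V, labels) if t == j} for j in range(1, k + 1)]
        scores = [F(A_t) for A_t in schedule]
        value = min(scores) if balanced else sum(scores) / k
        if best_value is None or value > best_value:
            best_value, best_schedule = value, schedule
    return best_value, best_schedule
```

With `F = len`, three locations, `k = 3` and `m = 3`, the balanced optimum is 1 (one location per slot); reducing the budget to `m = 2` forces at least one empty slot and a balanced value of 0.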

By performing this simultaneous optimization, we can obtain very different solutions, as illustrated
in Figure 2(c). In Section 6, we will show that this simultaneous approach can lead to drastically
improved performance as compared to the traditional, stage-wise approach. In this paper, we present
eSPASS, an efficient approximation algorithm with strong theoretical guarantees for this problem.

The placement and schedule in Figure 2(c) has the property that the sensors selected at each
time step share very similar locations, and hence perform roughly equally well. However, if activated
all at the same time, the “high-density” performance $F(A_1 \cup \dots \cup A_k)$ is much lower than that of
the placement in Figure 2(a). We also develop an algorithm, mcSPASS, that leads to placements
which perform well both in scheduled and in high-density mode. Figure 2(d) presents the solution
obtained by the mcSPASS algorithm.

Note that instead of fixing the number of time slots, we could also specify a desired accuracy
constraint $Q$ and then ask for the maximum lifetime solution, i.e., the largest number $k$ of time slots
such that a solution with minimum (or average) sensing quality $Q$ is obtained. Clearly, an algorithm
that solves Problem (2) (or Problem (1)) could be used to solve this alternative problem, by simply
binary searching over possible values for $k$.^3

Further  note  that  it  is  possible  to  allow  each  sensor  to  be  active  at  r  >  1  timesteps  by  using  the 
modification  described  in  Section  2.2. 


3  A  naive  greedy  algorithm 

We will first study the problem of optimizing the average performance over time, i.e., Problem (1), for
a fixed monotonic submodular sensing quality function $F$. Considering the fact that simultaneously
placing and scheduling is a strict generalization of sensor placement, which itself is NP-hard (c.f.,
Krause et al. (2007)), we cannot expect to efficiently find the optimal solution to Problem (1) in
general.

Instead,  we  will  use  the  following  intuitive  greedy  algorithm  that  we  call  GAPS  for  Greedy 
Average-case  Placement  and  Scheduling.  At  every  round,  GAPS  picks  a  time  slot  t  and  location  s 
which  increases  the  total  sensing  quality  the  most,  until  m  location/time-slot  pairs  have  been  picked. 
It  is  formalized  as  Algorithm  1. 

^3 However, in case an approximate algorithm is used, such as the eSPASS algorithm developed in this paper, its
guarantees are not necessarily preserved.
Algorithm GAPS ($F$, $V$, $k$, $m$)

    $A_t \leftarrow \emptyset$ for all $t$;
    for $i = 1$ to $m$ do
        foreach $s \in V \setminus (A_1 \cup \dots \cup A_k)$, $1 \le t \le k$ do
1           $\delta_{t,s} \leftarrow F(A_t \cup \{s\}) - F(A_t)$;
        end
        $(t^*, s^*) \leftarrow \operatorname{argmax}_{t,s} \delta_{t,s}$;
        $A_{t^*} \leftarrow A_{t^*} \cup \{s^*\}$;
    end

Algorithm 1: The greedy average-case placement and scheduling (GAPS) algorithm.
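
Algorithm 1 can be transcribed into Python almost line by line. The following is a minimal sketch rather than the authors' implementation: `F` is assumed to be any callable on Python sets, `V` is assumed to be a list (so that tie-breaking is deterministic), and the coverage instance is hypothetical.

```python
def gaps(F, V, k, m):
    """Greedy average-case placement and scheduling (sketch of Algorithm 1)."""
    buckets = [set() for _ in range(k)]   # A_1, ..., A_k
    used = set()
    for _ in range(m):
        best = None                       # (gain, slot index, location)
        for s in V:
            if s in used:
                continue
            for t in range(k):
                gain = F(buckets[t] | {s}) - F(buckets[t])   # Line 1
                # Strict '>' keeps the first maximizer, i.e. ties go to the
                # earliest location and lowest slot index.
                if best is None or gain > best[0]:
                    best = (gain, t, s)
        if best is None:                  # no unused location left
            break
        _, t, s = best
        buckets[t].add(s)
        used.add(s)
    return buckets


# Hypothetical coverage instance.
REGIONS = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}, "d": {1, 6}}


def coverage(A):
    covered = set()
    for s in A:
        covered |= REGIONS[s]
    return len(covered)


schedule = gaps(coverage, ["a", "b", "c", "d"], k=2, m=4)
```

Each of the at most `m` rounds scans every unused location and every slot, which matches the evaluation count of order `k * m * |V|` stated for GAPS.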


3.1  Theoretical  guarantee 

Perhaps surprisingly, we can show that this simple algorithm provides near-optimal solutions for
Problem (1). In fact, it generalizes the distributed Set-k Cover algorithm proposed by Abrams et al.
(2004) to arbitrary submodular sensing quality functions $F$, and to the setting where at most $m$
sensors can be selected in total.

Theorem 3.1. For any monotonic and submodular function $F$, GAPS returns a solution $A_1, \dots, A_k$
s.t.

$$\frac{1}{k} \sum_t F(A_t) \ge \frac{1}{2} \max_{A'} \frac{1}{k} \sum_t F(A'_t).$$

GAPS requires at most $O(kmn)$ evaluations of $F$.

The proofs of Lemma 4.2 and all other results are given in the Appendix. The key observation is
that Problem (1) is an instance of maximizing a submodular function subject to a matroid constraint
(c.f., the Appendix for details). A fundamental result by Fisher et al. (1978) then proves that the
greedy algorithm returns a solution that obtains at least one half of the optimal average-case score.
Matroids for sensor scheduling have been considered before by Williams et al. (2007).

3.2  Greedy  can  lead  to  unbalanced  solutions 

If a sensor placement and schedule is sought that performs well “on average” over time, GAPS
performs well. However, even though the average performance over time, $\frac{1}{k} \sum_t F(A_t)$, is high, the
performance at some individual timesteps $t'$ can be very poor, and hence the schedule can be unfair.
In security-critical applications, where high performance is required at all times, this behavior can
be problematic. In such settings, we might be interested in optimizing the balanced performance
over time, $\min_t F(A_t)$. This optimization task was raised as an open problem by Abrams et al.
(2004).

A first idea would be to try to modify the GAPS algorithm to directly optimize this balanced
performance, i.e., replace Line 1 in Algorithm 1 by

$$\delta_{t,s} \leftarrow \min_j F(A_j^{(s,t)}) - \min_j F(A_j),$$

where $A^{(s,t)}$ is the solution obtained by adding location $s$ to time slot $A_t$ in solution $A = A_1 \cup
\dots \cup A_k$. We call this modified algorithm the GBPS algorithm (for Greedy Balanced Placement
and Scheduling). Unfortunately, both GAPS and GBPS can perform arbitrarily badly. Consider a
simple scenario with three locations, $V = \{a, b, c\}$, and the monotonic submodular function $F(A) =
|A|$. We want to partition $V$ into three timesteps, i.e., $k = 3$ and $m = 3$. Here, the optimal solution
would be to pick $A_1 = \{a\}$, $A_2 = \{b\}$ and $A_3 = \{c\}$. However, both GAPS and GBPS would (ties
broken unfavorably) pick $A_1 = \{a, b, c\}$ and $A_2 = A_3 = \emptyset$, obtaining a minimum score of 0.
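
This example can be reproduced with a generic greedy loop in which only the gain rule differs between GAPS and GBPS. The sketch below is illustrative: the function names are hypothetical, and the unfavorable tie-breaking is modeled by always taking the first maximizer, which Python's built-in `max` does.

```python
F = len  # the modular function F(A) = |A| from the example


def gaps_gain(buckets, t, s):
    """Average-case gain: F(A_t + s) - F(A_t)."""
    return F(buckets[t] | {s}) - F(buckets[t])


def gbps_gain(buckets, t, s):
    """Balanced gain: change in min_j F(A_j) when s is added to slot t."""
    after = [A | {s} if j == t else A for j, A in enumerate(buckets)]
    return min(F(A) for A in after) - min(F(A) for A in buckets)


def greedy(V, k, m, gain):
    """Greedy allocation; max() returns the first maximizer, so ties favor slot 0."""
    buckets = [set() for _ in range(k)]
    for _ in range(m):
        free = [s for s in V if not any(s in A for A in buckets)]
        candidates = [(t, s) for s in free for t in range(k)]
        if not candidates:
            break
        t, s = max(candidates, key=lambda ts: gain(buckets, *ts))
        buckets[t].add(s)
    return buckets


V = ["a", "b", "c"]
gaps_result = greedy(V, k=3, m=3, gain=gaps_gain)
gbps_result = greedy(V, k=3, m=3, gain=gbps_gain)
```

Under this tie-breaking rule both variants place all three locations in the first slot. For GBPS, every candidate has gain 0 in every round, because at least one slot stays empty no matter where the next location goes, so the rule gives no signal at all.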
Figure 3: Illustration of our eSPASS algorithm. The algorithm first “guesses” (binary searches for)
the optimal value $c$. (a) Then, big elements $s$ where $F(\{s\}) \ge \beta c$ are allocated to separate buckets.
(b) Next, the remaining small elements are allocated to empty buckets using the GAPS algorithm.
(c, d) Finally, elements are reallocated until all buckets are satisfied: (c) reallocation move,
(d) reallocated elements.


Unfortunately, this poor performance is not just a theoretical example: in Section 6 we
demonstrate it empirically on real sensing tasks.

4  The  eSPASS  algorithm 

In the following, we will develop an efficient algorithm, eSPASS (for efficient Simultaneous
Placement and Scheduling of Sensors), that, as we will show in Section 4.2, is guaranteed to provide
a near-optimal solution to Problem (2). To the best of our knowledge, our algorithm is the
first algorithm with theoretical guarantees for this general problem, hence partly resolving the open
problem described by Abrams et al. (2004).

4.1  Algorithm  overview 

We  start  with  an  outline  of  our  algorithm,  and  then  proceed  to  discuss  each  step  more  formally. 

Our high-level goal will be to reduce the problem of optimizing the balanced objective into
a sequence of modified optimization problems involving an average-case objective, which we can
approximately solve using GAPS. This idea is based on the following intuition: Consider a truncated
objective function $F_c(A) = \min\{F(A), c\}$. The key observation^4 is that, for any constant $c$, it holds
that

$$\min_t F(A_t) \ge c \iff \frac{1}{k} \sum_{t=1}^{k} F_c(A_t) = c,$$

i.e., the minimum score is greater than or equal to $c$ if and only if the average truncated score is $c$.
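
The equivalence is easy to verify numerically. The sketch below uses hypothetical helper names and compares the truncated sum against $c \cdot k$ instead of dividing by $k$, which avoids floating-point comparisons for integer-valued $F$.

```python
def truncate(F, c):
    """Return the truncated objective F_c(A) = min(F(A), c)."""
    return lambda A: min(F(A), c)


def min_at_least(F, schedule, c):
    """Left-hand side: min_t F(A_t) >= c."""
    return min(F(A_t) for A_t in schedule) >= c


def avg_truncated_equals(F, schedule, c):
    """Right-hand side: (1/k) * sum_t F_c(A_t) = c, written as sum = c * k."""
    F_c = truncate(F, c)
    return sum(F_c(A_t) for A_t in schedule) == c * len(schedule)


schedule = [{"a", "b"}, {"c"}]   # with F = len the slot scores are 2 and 1
```

For `c = 1` both sides hold, and for `c = 2` both fail, because the second slot only reaches a truncated score of 1 and truncation prevents the first slot from compensating.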

Now suppose someone tells us the value $c^*$ attained by an optimal solution, i.e., $\max_{A} \min_t F(A_t) =
c^*$. By the above observation this problem is equivalent to solving

$$\max_{A_1, \dots, A_k} \frac{1}{k} \sum_{t=1}^{k} F_{c^*}(A_t). \qquad (3)$$

It can be shown (c.f., Fujito (2000)) that for $c \ge 0$, the truncated objective function $F_c$ remains
monotonic and submodular. Hence, Problem (3) is an instance of the average-case Problem (1).
Now we face the challenge that we do not generally know the optimal value $c^*$. We could use
a simple binary search procedure to find this optimal value. Hence, if we could optimally solve
the monotonic submodular average-case Problem (1), we would obtain an optimal solution to the
balanced Problem (2).

Unfortunately,  as  shown  in  Section  3.1,  solving  the  average-case  problem  is  NP-hard,  and,  using 
GAPS,  we  can  only  solve  it  approximately,  obtaining  a  solution  that  achieves  at  least  half  of  the 
optimal  value.  In  the  following,  we  will  show  how  we  can  turn  this  approximate  solution  for  the 
average-case  problem  into  a  near-optimal  solution  for  the  balanced  problem. 

Our algorithm will maintain one “bucket” $A_t \subseteq V$ for each time slot $t$. Since our goal is to develop
an approximation algorithm achieving at least a fraction $\beta > 0$ of the optimal sensing quality, we
need to allocate $m$ elements $s \in V$ to the $k$ buckets such that $F(A_t) \ge \beta c^*$ for all buckets $A_t$. Hereby,
$\beta$ is a constant that we will specify later. We call a bucket “satisfied” if $F(A_t) \ge \beta c^*$, “unsatisfied”
otherwise. Here is an outline of our eSPASS algorithm. Figure 3 presents an illustration.

1.  “Guess” the optimal value $c$.

2.  Call an element $s \in V$ “big” if $F_c(\{s\}) \ge \beta c$ and “small” otherwise. Put each big element
into a separate bucket (c.f., Figure 3(a)). From now on, we ignore those satisfied buckets, and
focus on the unsatisfied buckets.

3.  Run GAPS to optimize $F_c$, and allocate the small elements to the unsatisfied buckets (c.f.,
Figure 3(b)).

4.  Pick a “satisfied” bucket $A_t$ that contains sufficiently many elements, and reallocate enough
elements to an “unsatisfied” bucket to make it satisfied (c.f., Figures 3(c) and 3(d)). Repeat
this reallocation until no more buckets are unsatisfied or no more reallocation is possible. We will
show that this reallocation will always terminate.

5.  If all buckets are satisfied, return to step 1 with a more optimistic (higher) “guess” for $c$. If
at least one bucket remains unsatisfied, return to step 1 with a more pessimistic (lower) guess
for $c$.

eSPASS terminates with a value for $c$ such that all buckets $t$ have been assigned elements $A_t$
such that $F(A_t) \ge \beta c$. It guarantees that upon termination, $c$ is an upper bound on the value of the
optimal solution, hence providing a $\beta$ approximation guarantee. In Section 4.2 we will show that
$\beta = 1/6$ suffices. In summary, we have the following guarantee about eSPASS:


^4 Krause et al. (’08) used this observation to develop an algorithm for robust optimization of submodular functions.


Theorem 4.1. For any monotonic, submodular function $F$ and constant $\varepsilon > 0$, eSPASS, using
GAPS as subroutine, returns a solution $A_1, \dots, A_k$ such that

$$\min_t F(A_t) \ge \frac{1}{6} \max_{A'} \min_t F(A'_t) - \varepsilon.$$

eSPASS requires at most $O((1 + \log_2(F(V)/\varepsilon))\, kmn)$ evaluations of $F$.

Hereby, $\varepsilon$ is a tolerance parameter that can be made arbitrarily small. The number of iterations
increases only logarithmically in $1/\varepsilon$.
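
The logarithmic dependence on $1/\varepsilon$ comes from the binary search over $c$: the search interval starts at $[0, F(V)]$ and is halved in every iteration until it is narrower than $\varepsilon$. A small sketch that counts these halvings (the function name is hypothetical, and the stopping rule follows the binary search in Algorithm 2):

```python
import math


def binary_search_rounds(F_V, eps):
    """Number of halvings until an interval of width F_V is narrower than eps."""
    rounds, width = 0, float(F_V)
    while width >= eps:
        width /= 2
        rounds += 1
    return rounds


# Example: F(V) = 24 and eps = 0.01.
rounds = binary_search_rounds(24, 0.01)
bound = 1 + math.log2(24 / 0.01)
```

In this example the count stays below the factor $1 + \log_2(F(V)/\varepsilon)$ from the theorem, and halving `eps` adds only one more round.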


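Algorithm eSPASS ($F$, $V$, $k$, $m$, $\varepsilon$)

    $c_{\min} \leftarrow 0$; $c_{\max} \leftarrow F(V)$; $\beta \leftarrow 1/6$;
    while $c_{\max} - c_{\min} \ge \varepsilon$ do
        $c \leftarrow (c_{\max} + c_{\min})/2$;
1       $B \leftarrow \{s \in V : F_c(\{s\}) \ge \beta c\}$;
        $k' \leftarrow k$;
2       foreach $s \in B$ do
            $A_{k'} \leftarrow \{s\}$; $k' \leftarrow k' - 1$;
            if $k' = 0$ then $c_{\min} \leftarrow c$; $A_{\text{best}} \leftarrow (A_1, \dots, A_k)$; continue with while loop;
        end
        $V' \leftarrow V \setminus B$; $m' \leftarrow m - |B|$;
3       $A_{1:k'} \leftarrow$ GAPS($F_c$, $V'$, $k'$, $m'$);
4       if $\sum_t F_c(A_t) < k'c/2$ then $c_{\max} \leftarrow c$; continue;
        else
5           while $\exists\, i, j \le k'$: $F_c(A_j) < \beta c$, $F_c(A_i) \ge 3\beta c$ do
                foreach $s \in A_i$ do
                    $A_j \leftarrow A_j \cup \{s\}$; $A_i \leftarrow A_i \setminus \{s\}$;
                    if $F_c(A_j) \ge \beta c$ then break;
                end
            end
            $c_{\min} \leftarrow c$; $A_{\text{best}} \leftarrow (A_1, \dots, A_k)$;
        end
    end

Algorithm 2: The eSPASS algorithm for simultaneously placing and scheduling sensors.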
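
The pseudocode above can be sketched in Python as follows. This is an illustrative transcription, not the authors' implementation, and it makes a few assumptions that are not in the pseudocode: `F` is any callable on Python sets, the reallocation loop is capped at `k'` moves as a safeguard, and a guess `c` is only accepted if every bucket ends up satisfied (following step 5 of the outline); otherwise the guess is lowered.

```python
def gaps(F, V, k, m):
    """Greedy average-case allocation (Algorithm 1); ties go to the first candidate."""
    buckets, free = [set() for _ in range(k)], list(V)
    for _ in range(min(m, len(free))):
        _, t, s = max(
            ((F(buckets[t] | {s}) - F(buckets[t]), t, s)
             for s in free for t in range(k)),
            key=lambda cand: cand[0],
        )
        buckets[t].add(s)
        free.remove(s)
    return buckets


def espass(F, V, k, m, eps, beta=1 / 6):
    """Binary search over c with big-element removal, GAPS, and reallocation."""
    V = list(V)
    c_min, c_max = 0.0, float(F(set(V)))
    best = [set() for _ in range(k)]
    while c_max - c_min >= eps:
        c = (c_max + c_min) / 2

        def Fc(A, c=c):                       # truncated objective F_c
            return min(F(A), c)

        big = [s for s in V if Fc({s}) >= beta * c]           # Line 1
        if len(big) >= k:                                     # Line 2
            c_min, best = c, [{s} for s in big[:k]]
            continue
        k_rest = k - len(big)
        small = [s for s in V if s not in big]
        buckets = gaps(Fc, small, k_rest, m - len(big))       # Line 3
        if sum(Fc(A) for A in buckets) < k_rest * c / 2:      # Line 4
            c_max = c
            continue
        for _ in range(k_rest):                               # Line 5
            poor = [A for A in buckets if Fc(A) < beta * c]
            rich = [A for A in buckets if Fc(A) >= 3 * beta * c]
            if not poor or not rich:
                break
            src, dst = rich[0], poor[0]
            for s in list(src):               # move elements until dst is satisfied
                src.remove(s)
                dst.add(s)
                if Fc(dst) >= beta * c:
                    break
        if all(Fc(A) >= beta * c for A in buckets):
            c_min, best = c, [{s} for s in big] + buckets
        else:
            c_max = c
    return best
```

A possible call is `espass(len, range(24), k=2, m=24, eps=0.01)`. Note that the returned schedule is only guaranteed to reach one sixth of the optimal balanced value (up to `eps`), so it need not look evenly balanced.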


4.2  Algorithm  details 

We will now analyze each of the steps of eSPASS in detail. The pseudocode is given in Algorithm 2.

Removing big elements. The main challenge when applying the GAPS algorithm to the
truncated Problem (3) is exemplified by the following pathological example. Suppose the optimal
value is $c$. GAPS, when applied to the truncated function $F_c$, could pick $k/2$ elements $s_1, \dots, s_{k/2}$
with $F(\{s_i\}) = c$ each. While this solution obtains an average-case score of $c/2$ (one half of optimal
as guaranteed by Theorem 3.1), there is no possibility to reallocate these $k/2$ elements into $k$ buckets,
and hence some buckets will remain empty, giving a balanced score of 0.

To avoid this pathological case, we would like to eliminate such elements $s \in V$ with high
individual scores $F(\{s\})$, to make sure that we can rearrange the solution of GAPS to obtain high
balanced score. Hence, we distinguish two kinds of elements: big elements $s$ with $F(\{s\}) \ge \beta c$,
and small elements $s$ with $F(\{s\}) < \beta c$. If we intend to obtain a $\beta$ approximation to the optimal
score $c$, we realize that big elements have high enough value to each satisfy an individual bucket.
Let $B$ be the set of big elements (this set is determined in Line 1). If $|B| \ge k$, we already have a
$\beta$-approximate solution: Just put one big element in each bucket. If $|B| < k$, put each element in $B$
in a separate bucket $A_1, \dots, A_{|B|}$ (c.f., Line 2). We can now set these satisfied buckets aside, and
look at the reduced problem instance with elements $V' = V \setminus B$, $m' = m - |B|$ and $k' = k - |B|$. Our
first lemma shows that if the original problem instance $(F, V, k, m)$ has optimal value $c$, the reduced
problem instance $(F, V', k', m')$ still has optimal value $c$.

Lemma  4.2.  The  optimal  value  on  the  new  problem  instance  {F,V' ,k' ,m')  is  still  c. 

Hence,  without  loss  of  generality,  we  can  now  assume  that  for  all  s  G  V,  J^({s})  <  /3c. 
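
The big-element step can be illustrated with a short Python sketch. This is a minimal sketch under
stated assumptions, not the pseudocode of Algorithm 2: the function names `coverage` and
`split_big_elements`, the toy coverage objective, and the example numbers are all hypothetical and
chosen only for illustration.

```python
from fractions import Fraction


def coverage(sets):
    """Toy monotone submodular F (an assumption): number of items covered."""
    def F(A):
        covered = set()
        for s in A:
            covered |= sets[s]
        return len(covered)
    return F


def split_big_elements(F, V, k, m, c, beta=Fraction(1, 6)):
    """Give every big element (F({s}) >= beta * c) a bucket of its own.

    Returns (buckets, small_elements, k_reduced, m_reduced).
    """
    big = [s for s in V if F({s}) >= beta * c]
    if len(big) >= k:
        # Enough big elements: one per bucket satisfies all k buckets.
        return [{s} for s in big[:k]], [], 0, m - k
    buckets = [{s} for s in big]
    small = [s for s in V if F({s}) < beta * c]
    return buckets, small, k - len(big), m - len(big)


# Hypothetical instance: "a" is big for c = 12 (threshold beta * c = 2).
sets = {"a": {1, 2, 3, 4, 5, 6}, "b": {1}, "c": {2}, "d": {3}}
F = coverage(sets)
buckets, small, k_red, m_red = split_big_elements(F, sorted(sets), k=3, m=4, c=12)
```

In this toy instance the single big element fills one bucket, leaving a reduced instance with two
buckets and three sensors to place among the small elements.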

Solving the average-case problem. In the next step of the algorithm, we run an $\alpha$-approximate
algorithm (such as GAPS, where $\alpha = 1/2$), using the truncated objective $F_c$, on the reduced problem
instance containing only small elements (c.f., Line 3). This application results in an allocation
$A_1, \dots, A_{k'}$ of elements into buckets. If $\sum_t F_c(A_t) < \alpha c k'$, then we know that $c$ is
an upper bound on the optimal value, and it is safe to set $c_{\max}$ to $c$ (c.f., Line 4) and continue
with the binary search. Otherwise, we have a solution where $\sum_t F_c(A_t) \ge \alpha c k'$. However, as
argued in Section 3.2, this $\alpha$-approximate solution could still have balanced score 0, if all the
elements are allocated to only the first $\alpha k'$ buckets. Hence, we need to reallocate elements from
satisfied into unsatisfied buckets to obtain a balanced solution.
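
The truncation and the average-case test can be sketched as follows. This is an illustration only:
`greedy_allocate` is a plain greedy allocator standing in for GAPS (which is defined elsewhere in the
paper and may differ), and the function names and the toy modular objective are assumptions.

```python
def truncate(F, c):
    """Truncated objective F_c(A) = min(F(A), c)."""
    return lambda A: min(F(A), c)


def greedy_allocate(F, V, k, m, c):
    """Stand-in for GAPS (an assumption): greedily add the (sensor, bucket)
    pair with the largest gain in sum_t F_c(A_t), up to m sensors."""
    Fc = truncate(F, c)
    buckets = [set() for _ in range(k)]
    free = sorted(V)
    for _ in range(min(m, len(free))):
        best = None
        for s in free:
            for t in range(k):
                gain = Fc(buckets[t] | {s}) - Fc(buckets[t])
                if best is None or gain > best[0]:
                    best = (gain, s, t)
        _, s, t = best
        buckets[t].add(s)
        free.remove(s)
    return buckets


def passes_average_case_test(F, buckets, c, alpha=0.5):
    """True if sum_t F_c(A_t) >= alpha * c * k', i.e. c is not ruled out."""
    Fc = truncate(F, c)
    return sum(Fc(A) for A in buckets) >= alpha * c * len(buckets)


weights = {"a": 1, "b": 1, "c": 1, "d": 1}
F = lambda A: sum(weights[s] for s in A)   # toy modular objective
low = greedy_allocate(F, weights, k=2, m=4, c=2)
high = greedy_allocate(F, weights, k=2, m=4, c=10)
```

With the small threshold the truncation pushes sensors into the second bucket once the first is
saturated. With the large threshold every sensor lands in the first bucket, the test fails, and the
binary search would lower $c$.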

Reallocation. We will transfer elements from satisfied buckets to unsatisfied buckets, until all
buckets are satisfied. Let us define a “reallocation move” as follows (c.f., Line 5). Pick a bucket
$A_i = \{a_1, \dots, a_l\}$ for which $F_c(A_i) \ge 3\beta c$ (we will guarantee that such a bucket always
exists), and a bucket $A_j$ that is not satisfied, i.e., $F_c(A_j) < \beta c$. Choose $\ell$ such that
$F_c(\{a_1, \dots, a_{\ell-1}\}) < \beta c$ and $F_c(\{a_1, \dots, a_\ell\}) \ge \beta c$. Let
$\Delta = \{a_1, \dots, a_\ell\}$. Note that $\Delta$ is not empty since each $a_i$ is small (i.e.,
$F_c(\{a_i\}) < \beta c$). We reallocate the elements $\Delta$ by removing $\Delta$ from $A_i$ and adding
$\Delta$ to $A_j$.

Lemma 4.3. It holds that

$$F_c(A_j \cup \Delta) \ge \beta c, \quad \text{and} \quad F_c(A_i \setminus \Delta) \ge F_c(A_i) - 2\beta c.$$

Hence, removing the elements $\Delta$ does not decrease the value of $A_i$ by more than $2\beta c$, and thus
$A_i$ remains satisfied. On the other hand, the previously unsatisfied bucket $A_j$ becomes satisfied by
adding the elements $\Delta$. We want to make sure that we can always execute our reallocation move,
until all buckets are satisfied. The following result shows that if we choose $\beta = 1/6$, this will
always be the case:

Lemma 4.4. If we set $\beta = 1/6$, then, after at most $k$ reallocation moves, all buckets will be
satisfied, i.e., $F_c(A_i) \ge \beta c$ for all $i$.
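
A single reallocation move can be sketched in Python as below. This is a hedged illustration rather
than the paper's implementation: the name `reallocation_move`, the choice of the first eligible donor
and receiver, the sorted order used for $a_1, a_2, \dots$, and the toy objective are all assumptions.

```python
from fractions import Fraction


def reallocation_move(Fc, buckets, c, beta=Fraction(1, 6)):
    """Perform one reallocation move in place.

    Returns False if every bucket is already satisfied, True after a move.
    """
    unsatisfied = [j for j, A in enumerate(buckets) if Fc(A) < beta * c]
    if not unsatisfied:
        return False
    j = unsatisfied[0]
    donors = [i for i, A in enumerate(buckets) if Fc(A) >= 3 * beta * c]
    if not donors:
        raise ValueError("no bucket with F_c >= 3 * beta * c")
    i = donors[0]
    delta = set()
    for a in sorted(buckets[i]):       # fixed order a_1, a_2, ...
        delta.add(a)
        if Fc(delta) >= beta * c:      # shortest prefix worth >= beta * c
            break
    buckets[i] -= delta
    buckets[j] |= delta
    return True


c = 12
weights = {s: 1 for s in "abcdefgh"}
Fc = lambda A: min(sum(weights[s] for s in A), c)   # toy truncated objective
buckets = [set("abcdefgh"), set()]
while reallocation_move(Fc, buckets, c):
    pass
```

In this toy run one move transfers two unit-weight elements, after which both buckets reach the
threshold $\beta c = 2$ and the loop stops.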

Binary search. Since the optimal value $c$ is generally not known, we have to search for it. This is
done using a simple binary search strategy, starting with the interval $[0, F(V)]$, which is guaranteed
to contain the optimal value due to monotonicity. At every step, we test the center $c$ of the current
interval. If all buckets can be filled to $\beta c$, then the truncation threshold $c$ can be increased.
If the algorithm for maximizing the average-case score (such as GAPS) does not return a solution of value
at least $\alpha c$, then the optimal value has to be less than $c$, and the truncation threshold is
decreased (c.f., Line 4). If $F$ takes only integral values, then the binary search terminates after at
most $\lceil \log_2 F(V) \rceil + 1$ iterations.
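
For integer-valued $F$, the outer loop can be sketched as an integer binary search. This is a minimal
sketch under assumptions: `attempt` is a hypothetical callback that runs the inner steps for a given
threshold $c$ and returns a balanced solution or `None`, and the sketch assumes that success is
monotone in $c$.

```python
def binary_search_threshold(F_V, attempt):
    """Search the integers in [0, F(V)] for the largest accepted threshold.

    attempt(c) is a hypothetical callback: it returns a solution if all
    buckets could be filled to beta * c, and None otherwise.
    """
    lo, hi = 0, F_V          # lo: largest threshold known to succeed
    best = None
    iterations = 0
    while lo < hi:
        c = (lo + hi + 1) // 2         # center of the current interval
        iterations += 1
        solution = attempt(c)
        if solution is not None:
            lo, best = c, solution     # threshold can be increased
        else:
            hi = c - 1                 # threshold must be decreased
    return lo, best, iterations


# Toy callback: pretend thresholds up to 13 can be met.
c_found, solution, iterations = binary_search_threshold(
    100, lambda c: [c] if c <= 13 else None
)
```

The callback would wrap big-element removal, the average-case step, and reallocation for the tested
value of $c$.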

4.3  Improving  the  bounds 

The bound in Theorem 4.1 is “offline”: we can state it independently of the specific problem
instance. While guaranteeing that the obtained solutions cannot be arbitrarily bad, the constant
factor 6 bound for eSPASS is typically rather weak, and we can show that our obtained solutions
are typically much closer to the optimal value.

We can do this by computing the following data-dependent bounds on the optimal value. Let
$\mathcal{A}' = (A'_1, \dots, A'_k)$ be a candidate solution to Problem (2) (e.g., obtained using the
eSPASS algorithm or any other algorithm). For every $1 \le \ell \le k$ and $s \in V$ let
$\delta_{\ell,s} = F(A'_\ell \cup \{s\}) - F(A'_\ell)$ be the increment in function value when adding
sensor $s$ to time slot $\ell$.

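Theorem 4.5. The optimal value $c^* = \max_{\mathcal{A}} \min_t F(A_t)$ is bounded by the solution $c$ to
the following linear program:

$$\max \; c \quad \text{s.t.} \quad c \le F(A'_i) + \sum_{s} \lambda_{i,s}\,\delta_{i,s} \ \text{ for all } i,
\qquad \sum_{i} \lambda_{i,s} \le 1 \ \text{ for all } s, \qquad \sum_{i,s} \lambda_{i,s} \le m.$$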

Theorem 4.5 states that for any given instance of the SPASS problem, and for a candidate
solution $\mathcal{A}'$ obtained by using any algorithm (not necessarily eSPASS), we can solve a linear
program to efficiently get an upper bound on the optimal solution. In Section 6 we will show that
these bounds prove that our solutions obtained using eSPASS are often much closer to the optimal
solution than guaranteed by the bound of Theorem 4.1.
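
One way to see why a linear program of this form can bound the optimum is the following sketch of the
standard monotonicity and submodularity argument. It is not necessarily the paper's own proof, and it
assumes that the variables are nonnegative, $\lambda_{i,s} \ge 0$. Let $A^*_1, \dots, A^*_k$ be an
optimal solution and let $\delta_{i,s}$ be the increments defined with respect to the candidate
solution. Then, for every bucket $i$:

```latex
\begin{align*}
F(A^*_i) &\le F(A'_i \cup A^*_i) && \text{(monotonicity)} \\
         &\le F(A'_i) + \sum_{s \in A^*_i} \delta_{i,s} && \text{(submodularity)} \\
         &= F(A'_i) + \sum_{s} \lambda_{i,s}\,\delta_{i,s},
            \qquad \lambda_{i,s} = \mathbf{1}[s \in A^*_i].
\end{align*}
```

Because the optimal buckets are disjoint and use at most $m$ sensors in total, this choice of
$\lambda$ satisfies $\sum_i \lambda_{i,s} \le 1$ and $\sum_{i,s} \lambda_{i,s} \le m$. The pair
$(c^*, \lambda)$ is therefore feasible, so the optimum of the linear program is at least $c^*$.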

5  Trading  off  Power  and  Accuracy 

As argued in Section 1, deploying a larger number of sensors and scheduling them has an advantage
over deploying a small number of sensors with large batteries: a high-density mode can be
supported. In contrast to scheduled mode, where the sensors are activated according to the schedule,
in high-density mode all sensors are active, to provide higher resolution sensor data (e.g., to localize
the boundary of a traffic congestion in our running example). For a fixed solution $A_1, \dots, A_k$ to the
SPASS problem, the (balanced) scheduled-mode sensing quality is $\min_i F(A_i)$, whereas the
high-density sensing quality is $F(A_1 \cup \dots \cup A_k)$. Note that optimizing for the scheduled
sensing quality does not necessarily lead to good high-density sensing quality. Hence, if both modes of
operation should be supported, then we should simultaneously optimize for both performance measures. One
such approach to this multicriterion optimization problem is to define the scalarized objective

$$F_\lambda(A_1, \dots, A_k) = \lambda \min_i F(A_i) + (1 - \lambda) F(A_1 \cup \dots \cup A_k),$$

and then solve the problem

$$\max_{A_1, \dots, A_k} F_\lambda(A_1, \dots, A_k) \quad \text{s.t.} \quad A_i \cap A_j = \emptyset
\text{ if } i \ne j, \quad \Big|\bigcup_t A_t\Big| \le m. \qquad (4)$$


Note that if $\lambda = 1$, we recover the SPASS problem. Furthermore, as $\lambda \to 0$, the high-density
sensing quality $F(A_1 \cup \dots \cup A_k)$ dominates, and the chosen solution will converge to the
stage-wise approach, where first the set $A$ of all sensors is optimized, and then this placement is
partitioned into $A = A_1 \cup \dots \cup A_k$. Hence, by varying $\lambda$ between 1 and 0, we can
interpolate between the simultaneous and the stage-wise placement and scheduling.
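
To make the scalarized objective concrete, here is a minimal Python sketch that evaluates it for a
given schedule. The function name `scalarized_objective`, the toy coverage objective, and the example
buckets are assumptions made for illustration.

```python
def scalarized_objective(F, buckets, lam):
    """lam * min_i F(A_i) + (1 - lam) * F(A_1 u ... u A_k)."""
    balanced = min(F(A) for A in buckets)        # scheduled-mode quality
    high_density = F(set().union(*buckets))      # all sensors active
    return lam * balanced + (1 - lam) * high_density


# Toy coverage objective (an assumption): number of items covered.
sets = {"a": {1, 2}, "b": {2, 3, 4}}
F = lambda A: len(set().union(*(sets[s] for s in A)))
buckets = [{"a"}, {"b"}]

scheduled_only = scalarized_objective(F, buckets, lam=1)
high_density_only = scalarized_objective(F, buckets, lam=0)
mixed = scalarized_objective(F, buckets, lam=0.5)
```

Setting `lam=1` returns the balanced score alone and `lam=0` returns the high-density score alone,
matching the two limiting cases discussed above.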

We modify eSPASS to approximately solve Problem (4), and call the modified algorithm mcSPASS
(for multicriterion Simultaneous Placement and Scheduling of Sensors). The basic strategy
is still a binary search procedure. However, instead of simply picking all available big elements (as
done by eSPASS), mcSPASS will also guess (search for) the number $\ell$ of big elements used in the
optimal solution. It will pick these big elements in a greedy fashion, resulting in a set
$A_{big} \subseteq V$. For a fixed guess of $c$ and $\ell$, mcSPASS will again use GAPS as a subroutine.
However, the objective function used by GAPS will be modified to account for the high-density performance:

$$G(A_1, \dots, A_{k'}) = \lambda \sum_i F_c(A_i) + (1 - \lambda) F(A \cup A_{big}),$$

where $A = A_1 \cup \dots \cup A_{k'}$, and $k' = k - \ell$. This modified objective function combines a
component (weighted by $\lambda$) that measures the scheduled performance, as well as a component
(weighted by $1 - \lambda$) that measures the improvement in high-density performance, taking into account
the set $A_{big}$ of big elements that have already been selected. The reallocation procedure remains the
same as in eSPASS. The remaining details of our mcSPASS approach are presented in the proof of the
following theorem, which can be found in the Appendix.

Theorem 5.1. For any monotonic, submodular function $F$ and constants $\varepsilon > 0$ and
$0 \le \lambda \le 1$, mcSPASS will efficiently find a solution $A_1, \dots, A_k$ such that

$$F_\lambda(A_1, \dots, A_k) \ge \frac{1}{8} \max_{\mathcal{A}'} F_\lambda(A'_1, \dots, A'_k) - \varepsilon.$$

In  Section  6  we  will  see  that  we  can  use  this  extension  to  obtain  placements  and  schedules  that 
perform  well  both  in  scheduled  and  high-density  mode. 

6  Experiments 

In  our  experiments,  we  analyze  several  case  studies  on  different  data  sets.  Each  case  study  presents 
a  different  justification  of  the  SPASS  optimization  problem. 

6.1  Case  study  I:  Highway  monitoring 

The California highways are currently monitored by over 10,000 traffic sensors based on older
technologies. As these loops fail, they are being replaced by novel wireless sensor network technologies,
and it is an important problem to identify economic deployment strategies. PeMS (Berkeley) is a
website and project that integrates, cleanses and tracks real-time traffic information for the whole
state, computing key performance indicators. The sensors typically report speed, flow and vehicle
counts every 30 seconds, and PeMS aggregates the data further into 5-minute blocks. For this case
study, we use data from highway I-880 South, which extends for 35 miles in northern California
(Figure 4) and has between 3 and 5 lanes. This highway experiences heavy traffic, and accurate
measurements are essential for proper resource management. Measurement variation is mainly due
to congestion and events such as accidents and road closures. There are 88 measurement sites along
the highway, on average every 2 miles, which comprise 357 sensors covering all lanes. We use speed
information from lanes, for all days of the week in a single month, excluding weekends and holidays,
for the period from 6AM to 11AM, which is the time when the highway is congested. This is the
most difficult time for making predictions, as when there is no congestion, even a free flow speed
prediction of 60 mph is accurate.

The  number  and  locations  of  sensors  are  limited  by  costs  and  physical  deployment  constraints. 
Typically  at  each  location,  it  is  only  possible  to  place  one  sensor  at  each  lane.  Furthermore,  lane 
closures  for  sensor  installations  are  very  costly.  Given  these  constraints,  California  requires  that 
sensor  technologies  have  a  target  lifetime  of  10  years.  This  implies  that  most  wireless  sensor  solutions 
require intelligent scheduling in order to extend the lifetime by four times, since the batteries of most
sensor network solutions are expected to last 2 to 3 years. Including more batteries in a single sensor is
not  viable,  as  sensors  have  physical  constraints  to  avoid  disrupting  the  existing  pavement  structure 
and  keep  installation  costs  at  a  minimum. 

Figure 4: Placements and schedules for the traffic data. (a) Stage-wise approach, (b) SPASS solution.


As  wireless  sensors  displace  existing  loop  technologies,  it  is  desirable  to  place  as  few  sensors  as 
possible,  without  trading  off  too  much  sensing  quality.  To  achieve  these  goals  in  a  principled  manner, 
historical  loop  data  from  the  current  deployment  should  be  used.  eSPASS  provides  a  solution  which 
can  balance  these  conflicting  requirements,  by  combining  scheduling  and  placement:  more  sensors 
are  placed  initially,  still  keeping  road  closures  at  a  minimum,  and  scheduling  is  used  to  extend  the 
lifetime  of  the  network,  keeping  sensing  quality  balanced.  In  this  section  we  explore  this  solution 
and  compare  eSPASS  to  other  simultaneous  placement  and  scheduling  solutions. 

Simultaneous vs. stage-wise optimization. In our first experiment, we study the benefit of
simultaneously placing and scheduling sensors. For varying numbers $m$ of sensors and $k$ of time slots,
we use different strategies to find $k$ disjoint sets $A_1, \dots, A_k$, where $A_i$ is the set of sensors
active at time slot $i$. We compare the simultaneous placement and schedule (optimized using eSPASS and
GAPS) with solutions obtained by first placing sensors at a fixed set of locations, and then scheduling
them. We consider both optimized and random sensor placements, followed by optimized and random
scheduling, amounting to four stage-wise strategies. For random placements and schedules, we
report the mean and standard error over 20 random trials.
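
The fully random baseline (RP/RS) can be sketched as follows. This is only an illustration: the
paper does not specify how its random placements and schedules were drawn, so uniform sampling, the
round-robin split after shuffling, the fixed seed, and the function names are all assumptions.

```python
import random


def random_placement(V, m, rng):
    """RP: choose m of the candidate locations uniformly at random."""
    return rng.sample(sorted(V), m)


def random_schedule(placement, k, rng):
    """RS: shuffle the placed sensors and deal them round-robin into k slots."""
    shuffled = list(placement)
    rng.shuffle(shuffled)
    return [set(shuffled[t::k]) for t in range(k)]


rng = random.Random(0)                  # fixed seed, for reproducibility only
placement = random_placement(range(100), m=50, rng=rng)
slots = random_schedule(placement, k=4, rng=rng)
```

Repeating this with different seeds and averaging the resulting scores would mirror the reported mean
and standard error over random trials.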

Figure  5(a)  presents  the  performance  of  the  five  strategies  when  optimizing  the  average-case 
performance,  for  a  fixed  number  of  m  =  50  sensors  and  a  number  of  time  slots  k  varying  from  1  to 
20.  GAPS  performs  best,  followed  by  the  stage-wise  optimized  placement  and  schedule  (OP/OS). 
Of  the  two  strategies  where  one  component  (either  the  placement  or  the  schedule)  is  randomized 
(OP/RS  and  RP/OS),  for  small  numbers  (<  3)  of  time  slots  OP/RS  performs  slightly  better,  and 
for  large  numbers  of  time  slots  (>  10),  RP/OS  performs  slightly  better.  The  completely  randomized 
solution  performs  significantly  worse. 

Figure 5(b) presents the same results when optimizing the balanced criterion. eSPASS outperforms
the stage-wise strategies and the completely randomized strategy RP/RS performs worst, as
expected. Interestingly, for the balanced criterion, OP/RS performs drastically worse than RP/OS
for $k > 4$ time slots. We hypothesize this to be due to the fact that a poor random placement
can more easily be compensated for by using a good schedule than vice versa: when partitioning a
sensor placement of 50 sensors randomly into a large number of time slots, it is fairly likely that at
least one of the time slots exhibits poor performance, hence leading to a poor balanced score. This
insight also suggests that the larger the intended improvement in network lifetime (number of time
slots), the more important it is to optimize for a balanced schedule.

Figure 5: Results for traffic monitoring [T]. (a,b) compare simultaneous placement and scheduling
to stage-wise strategies on (a) average-case and (b) balanced performance (m = 50, k varies). (c)
compares average-case and balanced performance, when optimizing for average-case (using GAPS)
and balanced (using eSPASS) performance. (d) “Online” (data-dependent) bounds show that the
eSPASS solutions are closer to optimal than the factor 6 “offline” bound from Theorem 4.1 suggests.


To summarize this analysis, we see that simultaneous placement and scheduling drastically
outperforms the stage-wise strategies. For example, if we place 50 sensors at random, and then use
eSPASS to schedule them into 4 time slots, we achieve an estimated minimum reduction in mean
squared error of 58%. If we first optimize the placement and then use eSPASS for scheduling,
we can achieve the same amount of variance reduction by scheduling 6 time slots (hence obtaining
a 50% increase in network lifetime). If instead of stage-wise optimization we simultaneously
optimize the placement and the schedule using eSPASS, we can obtain the same variance reduction by
scheduling 8 time slots, hence an increase in network lifetime by 100%.
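
The lifetime figures above are relative changes in the number of time slots $k$ at equal sensing
quality. Spelled out, with 4 slots as the random-placement baseline (the third ratio, simultaneous
versus stage-wise optimization, is derived here from the two slot counts in the text and is consistent
with the 33% improvement quoted in the abstract):

```latex
\[
\frac{6 - 4}{4} = 50\%, \qquad
\frac{8 - 4}{4} = 100\%, \qquad
\frac{8 - 6}{6} \approx 33\%.
\]
```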

Average vs. balanced performance. We have seen that simultaneously placing and scheduling
can drastically outperform stage-wise strategies, for both the average-case and the balanced
objective. But which of the objectives should we use? In order to gain insight into this question,
we performed the following experiment. For varying $k$ and $m$, we obtain solutions to the SPASS
problem using both the eSPASS and the GAPS algorithm. We then evaluate the respective solutions
using both the average-case and the balanced criterion. Figure 5(c) presents the results of this
experiment for $k$ varying from 1 to 10, and a fixed ratio of 5 sensors per time slot. As expected,
eSPASS outperforms GAPS with respect to the balanced criterion, and GAPS outperforms eSPASS
according to the average-case criterion. However, while eSPASS achieves an average-case score very
close to that of the solution obtained by GAPS, the balanced score of the GAPS solutions is far worse
than that obtained by eSPASS. Hence, optimizing for the balanced criterion performs well for the
average case, but not vice versa.

Figure 6: (a) Results for community sensing [C]. When querying each car only once each week (using
the eSPASS schedule), the sensing quality is only 23% lower than when querying every day. (b-d)
Example placements and schedules for water networks [W].
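
The two evaluation criteria can be written down in a few lines. This is a minimal sketch with
hypothetical function names and a toy modular objective; it only illustrates how two schedules with
the same average-case score can differ in balanced score.

```python
def average_score(F, buckets):
    """Average-case criterion: mean of F(A_t) over the time slots."""
    return sum(F(A) for A in buckets) / len(buckets)


def balanced_score(F, buckets):
    """Balanced criterion: worst F(A_t) over the time slots."""
    return min(F(A) for A in buckets)


weights = {"a": 1, "b": 1, "c": 1, "d": 1}
F = lambda A: sum(weights[s] for s in A)   # toy modular objective

lopsided = [{"a", "b", "c"}, {"d"}]
even = [{"a", "b"}, {"c", "d"}]
scores = {
    "lopsided": (average_score(F, lopsided), balanced_score(F, lopsided)),
    "even": (average_score(F, even), balanced_score(F, even)),
}
```

Both toy schedules have the same average-case score, but only the even one also has a high balanced
score.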

Online  bounds.  In  order  to  see  how  close  the  solutions  obtained  by  eSPASS  are  to  the  optimal 
solution,  we  also  compute  the  bounds  from  Theorem  4.5.  Figure  5(d)  presents  the  bounds  on  the 
maximum  variance  reduction  achievable  when  placing  50  sensors  and  partitioning  them  into  an 
increasing  number  of  groups.  We  plot  both  the  factor  6  bound  due  to  Theorem  4.1,  as  well  as 
the data-dependent bound due to Theorem 4.5. We can see that the data-dependent bounds are
much tighter. For example, if we partition the sensors into 2 groups, our solution is at least 78% of
optimum; for 5 groups it is at least 70% of optimum (rather than the 17% of Theorem 4.1).

Figure 7: (a,b) Contamination detection in water networks [W]. (a) compares simultaneous and
stage-wise solutions, (b) power/accuracy tradeoff curve with strong knee. (c,d) compare eSPASS
with existing solutions on synthetic data [S].


6.2  Case  study  II:  Community  Sensing 

While the static deployment of sensors has become an important means for monitoring traffic on
highways and arterial roads, due to high deployment and maintenance cost it is difficult to extend
sensor coverage to urban side streets. However, in order to optimize road-network utilization,
accurate estimates of side-street conditions are essential.

Instead  of  (or  in  addition  to)  statically  deploying  sensors,  a  promising  approach,  studied  by 
Krause  et  al.  (2008a),  would  be  to  utilize  cars  as  traffic  sensors:  An  increasing  number  of  vehicles 
nowadays  are  equipped  with  GPS  and  Personal  Navigation  Devices,  which  can  accurately  localize 
a  car  on  a  road  network.  Furthermore,  these  devices  are  becoming  connected  to  wireless  networks, 
using, e.g., GPRS or EDGE connectivity, through which they could report their location and speed.
Hence,  in  principle,  it  is  possible  to  access  accurate  sensor  data  through  the  network  of  cars. 

However, there are significant challenges when dealing with such a network of non-centrally owned
sensors. While users may generally consider sharing their sensor data, they have reasonable
concerns about their privacy. Krause et al. (2008a) provide methods for community sensing, describing
strategies for selectively querying a community sensor network while maintaining preferences about
privacy. They demonstrated how the selective querying of such a community sensor network can be
modeled as the problem of optimizing a monotonic submodular sensing quality function that considers
demand based on road usage. Preferences about privacy map to constraints in the optimization
problem.

One basic preference that needs to be supported is the preference that each user is queried at
most once in a specified time interval (e.g., queried at most once each week or month). Let $V$ be the
set of users subscribing to the community sensing service. In order to query each user at most once in
$k$ time steps, one strategy would be to partition the users into $k$ sets $A_1, \dots, A_k$, such that at
time step $t$, users $A_t$ are queried. In order to obtain continuously high performance of the monitoring
service, we want to make sure that the performance $F(A_t)$ is maximized simultaneously over all
time steps. This is exactly an instance of the SPASS problem.

In order to evaluate the performance of the eSPASS algorithm, we used the experimental setup of
Krause et al. (2008a), using real traffic data from 534 detector loops deployed underneath highways,
GPS traces from 85 volunteer drivers and demand data based on directions generated in response
to requests to a traffic prediction and route planning prototype named ClearFlow, developed at
Microsoft Research (see the footnote on ClearFlow). Details about the data sets are described by
Krause et al. (2008a). Based on this experimental setup, we compare the performance of the eSPASS and
GAPS algorithms (the latter of which was described as a candidate scheduling approach by Krause et al.
(2008a)). Using each algorithm, we partition the users into 7 and 31 sets, respectively (i.e., querying
each user at most once each week or once each month, both of which are possible options for privacy
preferences). We then evaluate the performance based on the worst-case prediction error over all test
time steps. Figure 6(a) presents the results of this experiment. We can see that the eSPASS solutions
outperform the GAPS solutions. When partitioning into 31 sets, the worst prediction performance of
eSPASS is more than twice as good as the worst prediction performance of GAPS. Most importantly, this
experiment shows that, using eSPASS for scheduling, one can obtain a very high balanced performance,
even when querying each individual car only very infrequently. For example, when querying each car
only once each week, the balanced sensing quality is only 23% lower than that obtained by the
privacy-intrusive continuous (daily) querying. Even if querying only once each month (i.e., a factor
of 31 less frequently), the balanced performance is only reduced by approximately a factor of 2.
These results indicate that, using eSPASS, even stringent preferences about privacy can be met
without losing much prediction accuracy.

6.3  Case  study  III:  Contamination  detection 

Consider a city water distribution network, delivering water to households via a system of pipes,
pumps  and  junctions.  Accidental  or  malicious  intrusions  can  cause  contaminants  to  spread  over  the 
network,  and  we  want  to  select  a  few  locations  (pipe  junctions)  to  install  sensors,  in  order  to  detect 
these  contaminations  as  quickly  as  possible.  In  August  2006,  the  Battle  of  Water  Sensor  Networks 
(BWSN)  (et  al.,  2008)  was  organized  as  an  international  challenge  to  find  the  best  sensor  placements 
for  a  real  (but  anonymized)  metropolitan  water  distribution  network,  consisting  of  12,527  nodes.  In 
this  challenge,  a  set  of  intrusion  scenarios  is  specified,  and  for  each  scenario  a  realistic  simulator 
provided  by  the  EPA  (Rossman,  1999)  is  used  to  simulate  the  spread  of  the  contaminant  for  a  48 
hour  period.  An  intrusion  is  considered  detected  when  one  selected  node  shows  positive  contaminant 
concentration. 

The goal of BWSN was to minimize impact measures, such as the expected population affected,
which is calculated using a realistic disease model. Krause et al. (2008b) showed that the function
$F(A)$ which measures the expected population protected by placing sensors at locations $A$ is a
monotonic submodular function. Water quality sondes can operate for a fairly long amount of
time on battery power. For example, the YSI 6600 Sonde can sample 15 water quality parameters
every 15 minutes for 75 days. However, for long-term feasibility it is desirable to considerably
improve this battery lifetime by sensor scheduling. On the other hand, high sampling rates are
desirable to ensure rapid response to possible contaminations. For a security-critical sensing task
such as protecting drinking water from contamination, it is important to obtain balanced, uniformly
good detection performance over time. In addition, deployment and maintenance costs restrict the
number of sensors that can be deployed. Hence, the problem of deploying battery-powered sensors
for drinking water quality monitoring is another natural instance of the SPASS problem.

Footnote (ClearFlow): The ClearFlow research system, available only to users within Microsoft
Corporation, was the prototype for the Clearflow context-sensitive routing service now available
publicly for North American cities at http://maps.live.com.

We reproduce the experimental setup detailed in (Krause et al., 2008b). However, instead of
only  optimizing  for  the  sensor  placement,  we  simultaneously  optimize  for  placement  and  schedule 
using  the  eSPASS  algorithm.  Figure  7(a)  compares  eSPASS  with  the  stage-wise  approaches.  For 
each  algorithm,  we  report  the  population  protected  by  placing  sensors,  normalized  by  the  maximum 
protection  achievable  when  placing  sensors  at  every  node  in  the  network. 

Simultaneous vs. stage-wise optimization. eSPASS obtains drastically improved performance
when compared to the stage-wise approaches. For example, when scheduling 3 time slots, in order
to obtain 85% protection, eSPASS requires 18 sensors. The fully optimized stage-wise approach
(OP/OS) requires twice the number of sensors. When placing 36 sensors, the stage-wise approach
leaves 3 times more population unprotected as compared to the simultaneous eSPASS solution with
the same number of sensors. eSPASS solved this large-scale optimization task ($n = 12{,}527$, $k = 3$,
$m = 30$) in 26 minutes using our MATLAB implementation.

Trading off power and accuracy. We also applied our modified eSPASS algorithm (mcSPASS) in order to
trade off scheduled-mode and high-density-mode performance. For a fixed number of $m = 30$ sensors
and $k = 3$ time slots, we solve Problem (4) for values of $\lambda$ varying from 0 to 1. For each value
of $\lambda$, we obtain a different solution, and plot the normalized expected population protected
(higher is better) both in scheduled and in high-density mode in Figure 7(b). We can see that this
trade-off curve exhibits a prominent knee, where solutions are obtained that perform nearly optimally
with respect to both criteria. Figures 6(b), 6(c) and 6(d) show the placements and schedules obtained for
$\lambda = 1$ (i.e., ignoring the high-density sensing quality), $\lambda = 0$ (ignoring the schedule,
effectively performing a stage-wise approach) and $\lambda = 0.25$ (from the knee in the trade-off
curve), respectively. Note how the solution for $\lambda = 1$ clusters the sensors closely together,
obtaining three very similar placements $A_1, A_2, A_3$, one for each time slot (similar to Figure 2).
The solution for $\lambda = 0$ spreads out the sensors more, having to leave, e.g., the Western part of
the network uncovered in the time slot indicated by the green triangle. The multicriterion solution
($\lambda = 0.25$) is a compromise between the former two solutions: the sensors are still clustered
together, but also spread out more, so that the Western part of the network can be covered in this
solution.

6.4  Case  study  IV:  Multiple  weblog  coverage 

The fourth case study is in a very different application domain, an online information network
rather than a physically connected sensor network. Leskovec et al. (2007) studied
the problem of selecting informative weblogs to read on the Internet. Our approach is based on
the  intuitive  notion  of  an  information  cascade:  A  blogger  writes  a  posting,  and,  after  some  time, 
other  blogs  link  to  it.  An  information  cascade  is  a  directed  acyclic  graph  of  vertices  (each  vertex 
corresponds  to  a  posting  at  some  blog),  where  edges  are  annotated  by  the  time  difference  between  the 
postings.  Based  on  this  notion  of  an  information  cascade,  we  would  like  to  select  blogs,  that  detect 
big  cascades  (containing  many  nodes)  as  early  as  possible  (i.e.,  we  want  to  learn  about  an  important 
event  before  most  other  readers).  In  Leskovec  et  al.  (2007)  it  is  shown  how  one  can  formalize  this 
intuition  using  a  monotonic  submodular  function  F  that  measures  the  informativeness  of  a  subset 
A  of  all  blogs  V.  Optimizing  the  submodular  function  F  leads  to  a  small  set  A  of  blogs  that 
“covers”  most  cascades.  Since  blogs  from  different  topics  (such  as  politics,  religion,  tech  news,  etc.) 
participate  in  different  cascades,  the  solution  set  A  will  typically  contain  very  diverse  blogs.  Similar 
blogs  that  overlap  significantly  suffer  from  diminishing  returns. 

While such a notion of coverage is intuitive, sometimes multiple coverage is desired: we might be
interested in selecting “multiple representatives” from each topic, whereby we allow some (limited)
amount of overlap among the covered cascades. For example, there might be political blogs that
participate in the same cascades (discussions), and hence have a lot of overlap, but it is nevertheless
desirable to select multiple blogs as each of them presents their own perspective.

Figure 8: Results on temperature data from Intel Research Berkeley [B]. (a,b) compare eSPASS
with existing solutions, (c) compares running time, (d) compares average-case and balanced
performance.

We  can  formulate  this  requirement  as  an  instance  of  the  SPASS  problem:  We  would  like  to 
select  multiple,  disjoint  sets  of  blogs,  that  each  provide  a  large  amount  of  information. 
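
One way to write this requirement in symbols, as a sketch: the notation ($k$ disjoint groups $A_1, \dots, A_k$ of blogs, a total budget of $m$ blogs, and the informativeness function $F$) is assumed here to match the balanced objective that the paper refers to as Problem (2), which is not restated in this section.

```latex
\max_{A_1, \dots, A_k \subseteq V} \; \min_{1 \le t \le k} F(A_t)
\quad \text{s.t.} \quad
A_i \cap A_j = \emptyset \ \ (i \ne j), \qquad
|A_1 \cup \dots \cup A_k| \le m .
```

With $k = 3$, each of the three groups must be informative on its own, which is what permits up to three-fold coverage of a cascade.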

We reproduce the experimental setup of Leskovec et al. (2007), which is based on 45,000 blogs and
one million posts during 2006. This data set contains 16,000 cascades in which at least 10
blogs participate. We use the population affected objective function described in detail in Leskovec
et al. (2007). Based on this experimental setup, we apply eSPASS to partition an increasing number
of blogs into three groups (k = 3, to allow up to three-fold coverage). As a comparison, we first
select the set of blogs (using the approach described in Leskovec et al. (2007)), and then use the
eSPASS algorithm to partition the selected set into three groups. We also compare against the
other stage-wise approaches, as done in Section 6.1.

Figure  9  presents  the  results  of  this  experiment.  It  shows  that  blog  selections  using  the  eSPASS 
algorithm  obtain  10%  better  performance  than  the  stage-wise  approach.  Further  note  that  eSPASS 
is  able  to  solve  the  large  problem  instance  with  n  =  45,000  within  26  minutes. 

Figure 9: Multiple coverage for blogs

6.5  Comparison  with  existing  techniques 

We  also  compare  eSPASS  with  several  existing  algorithms.  Since  the  existing  algorithms  apply  to 
the  scheduling  problem  only,  we  call  eSPASS  with  m  =  |V|  (i.e.,  allow  it  to  select  all  sensors). 

Set covering. Most existing algorithms for sensor scheduling assume that sensors are associated
with a fixed sensing region that can be perfectly observed by the sensor (c.f., Abrams et al. (2004);
Deshpande et al. (’08)). In this setting, we associate with each location $s \in V$ a set
$\mathcal{R}_s \subseteq V$ of locations that can be monitored by the sensor, and define the sensing
quality $F(A) = |\bigcup_{s \in A} \mathcal{R}_s|$ to be the total area covered by all sensors. Since
set coverage is an example of a monotonic submodular function, we can use eSPASS to optimize it.
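
The coverage objective above can be sketched in a few lines. This is a minimal illustration, assuming each sensor's region is given as a Python set; the names `coverage`, `is_submodular`, and the toy `regions` dictionary are hypothetical and not taken from the paper.

```python
from itertools import combinations

def coverage(selected, regions):
    """F(A) = size of the union of R_s over s in A."""
    covered = set()
    for s in selected:
        covered |= regions[s]
    return len(covered)

def is_submodular(F, ground):
    """Brute-force diminishing-returns check on a small ground set:
    F(A + s) - F(A) >= F(B + s) - F(B) for all A subset of B, s not in B."""
    ground = list(ground)
    subsets = [set(c) for r in range(len(ground) + 1)
               for c in combinations(ground, r)]
    for A in subsets:
        for B in subsets:
            if not A <= B:
                continue
            for s in ground:
                if s in B:
                    continue
                if F(A | {s}) - F(A) < F(B | {s}) - F(B):
                    return False
    return True

# Hypothetical toy instance: three sensors with overlapping regions.
regions = {"s1": {1, 2, 3}, "s2": {3, 4}, "s3": {4, 5, 6}}
F = lambda A: coverage(A, regions)
print(F({"s1", "s2"}))            # 4 distinct regions: {1, 2, 3, 4}
print(is_submodular(F, regions))  # True
```

The brute-force check is exponential in the number of sensors and is only meant to make the diminishing-returns property concrete on tiny instances.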

We compare eSPASS to the greedy approach by Abrams et al. (2004), as well as the approach
by Deshpande et al. (’08) that relies on solving a semidefinite program (SDP). We use the synthetic
experimental setup defined by Deshpande et al. (’08) to compare the approaches. A set of n sensors
is used to cover M regions. Each sensor s is associated with a set $\mathcal{R}_s$ of regions it
covers. The objective is to divide the n sensors into k groups (buckets), such that the minimum or
the average number of regions covered by each group is maximized.

For the SDP by Deshpande et al. (’08), we solve the SDP using SeDuMi to get a distribution
over possible schedules, and then pick the best solution out of 100 random samples drawn from this
distribution. For the random assignment approach (Rand100) of Abrams et al. (2004), we sample 100
random schedules and pick the best one. In addition, we run the GAPS and the eSPASS algorithms.
We apply those four algorithms to 50 random set cover instances as defined by Deshpande et al.
(’08): for each sensor, a uniform random integer r between 3 and 5 is chosen, and then the first r
regions from a random permutation of the set of M regions are assigned to that sensor. The sensor
network size is n = 20, the number of desired groups is k = 5, and the number of regions is M = 50.
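
The instance generator and the random-schedule baseline described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the seeds are arbitrary, and "best" is taken here to mean best under the balanced (minimum) objective, which the text does not specify for Rand100.

```python
import random

def random_instance(n=20, M=50, rng=None):
    """Each sensor covers the first r regions of a random permutation, r in {3, 4, 5}."""
    if rng is None:
        rng = random.Random(0)
    regions = []
    for _ in range(n):
        r = rng.randint(3, 5)            # inclusive on both ends
        perm = rng.sample(range(M), M)   # random permutation of the M regions
        regions.append(set(perm[:r]))
    return regions

def group_values(assignment, regions, k):
    """Number of regions covered by each of the k groups."""
    covered = [set() for _ in range(k)]
    for sensor, g in enumerate(assignment):
        covered[g] |= regions[sensor]
    return [len(c) for c in covered]

def rand100(regions, k=5, trials=100, rng=None):
    """Best of `trials` uniformly random schedules (assumed: balanced objective)."""
    if rng is None:
        rng = random.Random(1)
    best, best_val = None, -1
    for _ in range(trials):
        assignment = [rng.randrange(k) for _ in regions]
        val = min(group_values(assignment, regions, k))
        if val > best_val:
            best, best_val = assignment, val
    return best, best_val

regions = random_instance()
schedule, value = rand100(regions)
```

Replacing `min` by an average in `rand100` gives the corresponding baseline for the average-case objective.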

Figure  7(c)  presents  the  average  performance  of  the  four  approaches.  In  this  setting,  the  SDP 
performs  best,  closely  followed  by  GAPS  and  eSPASS.  Figure  7(d)  presents  the  balanced  per¬ 
formance  of  the  four  approaches.  Here,  eSPASS  significantly  outperforms  both  the  SDP  and  the 
GAPS  solution. 

Building  monitoring.  As  argued  in  the  introduction,  for  complex  spatial  monitoring  problems, 
the  sensing  region  assumption  is  unrealistic,  and  we  would  rather  like  to  optimize  prediction  accuracy 
directly.  The  approach  by  Koushanfary  et  al.  (2006)  is  designed  to  schedule  sensors  under  constraints 
on  the  prediction  accuracy.  Their  approach,  given  a  required  prediction  accuracy,  constructs  a 
prediction  graph  that  encodes  which  sensors  can  predict  which  other  sensors.  They  then  solve 
a domatic partitioning problem, selecting a maximal number of disjoint subsets that can predict
all other sensors with the desired accuracy. In order to determine the domatic partitioning, their
algorithm relies on the solution of a Mixed Integer Program (MIP). However, solving MIPs is
NP-hard in general, and unfortunately, we were not able to scale their approach to the traffic data
application. Instead, we use data from 46 temperature sensors deployed at Intel Research, Berkeley
(c.f., Deshpande et al. (2004)).

On this smaller data set, we first apply the MIP for domatic partitioning, with a specified
accuracy constraint $\varepsilon$. The MIP was very sensitive with respect to this accuracy
constraint. For values of $\varepsilon$ that were just slightly too small, the MIP returned a trivial
solution consisting of only a single set. For slightly too large values, the MIP had to consider
partitions into a large number of possible time slots, increasing the size of the MIP such that the
solver ran out of memory. Requiring that sensors can predict each other with a Root Mean Squared
(RMS) error of 1.25 Kelvin leads to a selection of m = 19 sensors, partitioned into k = 3 time slots.
Using this setting for m and k, we run the GAPS and eSPASS algorithms, which happen to return the
same solution for this example. In order to compare these solutions with the SDP and random
selection from the previous section, we apply them to the prediction graph induced by the required
prediction accuracy. We first randomly select 19 locations, and then partition them into 3 groups
using the SDP and Rand100 approach, respectively. As a baseline, we randomly select 3 groups
totaling 19 sensors (Rand). For these randomized techniques, we report the distribution over 20
trials. All approaches are evaluated based on the variance reduction objective function.

Figure  8(a)  presents  the  result  for  optimizing  the  average  variance  reduction,  and  Figure  8(b) 
for  the  minimum  variance  reduction.  In  both  settings,  GAPS  and  eSPASS  perform  best,  obtaining 
23%  less  remaining  maximum  variance  when  compared  to  the  MIP  solution  of  Koushanfary  et  al. 
(2006). Furthermore, using YALMIP in Matlab, solving the MIP requires 95 seconds, as compared to
4 seconds for the SDP and 3.8 seconds for eSPASS (Figure 8(c)). Even though the MIP returns an
optimal solution for the domatic partition of the prediction graph, eSPASS performs better since
it exploits the fact that the combination of multiple sensors can lead to better prediction accuracy
than only using single sensors for prediction. Even the best out of 20 random trials for the SDP
performs worse than the MIP, due to the approximate nature of the algorithm and the random
selection of the initial 19 sensors. The Rand100 approach performs only slightly (not significantly)
worse than the SDP-based approach.

7  Related  Work 

In  the  context  of  wireless  sensor  networks,  where  sensor  nodes  have  limited  battery  and  can  hence 
only  enable  a  small  number  of  measurements,  optimally  placing  and  scheduling  sensors  is  of  key 
importance. 

Sensor Placement. Many approaches for optimizing sensor placements assume that sensors have
a fixed region (Hochbaum and Maas, 1985; Gonzalez-Banos and Latombe, 2001; Bai et al., 2006).
These regions are usually convex or even circular. Furthermore, it is assumed that everything within
this region can be perfectly observed, and everything outside cannot be measured by the sensors.
For complex applications such as traffic monitoring, however, such assumptions are unrealistic, and
the direct optimization of prediction accuracy is desired. The problem of selecting observations for
monitoring spatial phenomena has been investigated extensively in geostatistics (c.f., Cressie (1991)
for an overview), and more generally in (Bayesian) experimental design (c.f., Chaloner and Verdinelli
(1995)). Submodularity has been used to analyze algorithms for placing a fixed set of sensors (Krause
et al., 2007). These approaches, however, only consider the sensor placement problem, and not the
scheduling aspect.

Sensor Scheduling. The problem of deciding when to selectively turn on sensors in order to
conserve power was first discussed by Slijepcevic and Potkonjak (2001) and Zhao et al. (2002).
Typically, it is assumed that sensors are associated with a fixed sensing region, and a spatial domain
needs to be covered by the regions associated with the selected sensors. Abrams et al. (2004) present
an efficient approximation algorithm with theoretical guarantees for this problem. Deshpande et al.
(’08) present an approach for this problem based on semidefinite programming (SDP), handling
more general constraints and providing tighter approximations. They also provide a randomized
rounding based approach for scheduling under the balanced objective (which they call min-coverage
(time)). However, in contrast to eSPASS (when specialized to scheduling), their algorithm requires
relaxing the constraint that each sensor location can only be selected once. Also, their guarantee
only holds with high probability, whereas eSPASS is deterministic. The approaches described above
do not apply to the problem of optimizing sensor schedules for more complex sensing quality
functions such as the increase in prediction accuracy and the other sensing quality functions
considered in this paper. To address these shortcomings, Koushanfary et al. (2006) developed an
approach for sensor scheduling that guarantees a specified prediction accuracy based on a regression
model. However, their approach relies on the solution of a Mixed Integer Program, which is
intractable in general. Zhao et al. (2002) proposed heuristics for selectively querying nodes in a
sensor network in order to reduce the entropy of the prediction. Unlike the algorithms presented in
this paper, their approaches do not have any performance guarantees.

Submodular  optimization.  The  problem  of  maximizing  a  submodular  function  subject  to  a 
matroid  constraint,  of  which  Problem  (1)  is  an  instance,  has  been  studied  by  Fisher  et  al.  (1978), 
who  proved  that  the  greedy  algorithm  gives  a  factor  2  approximation.  Recently,  Vondrak  (2008) 
showed  that  a  more  complex  algorithm  achieves  a  (1-1/e)  approximation  to  this  problem.  Note 
that  this  algorithm  could  be  applied  to  the  Problem  (1)  instead  of  GAPS.  Furthermore  note  that 
using this algorithm as a subroutine, the analysis of eSPASS can be improved to give a $(1-1/e)/3$
guarantee. Comparing against this algorithm is an interesting direction for future work. A related
version of
optimization  Problem  (2),  where  for  each  time  step  t  a  different  function  Ft  is  used  has  been  studied  in 
the  context  of  combinatorial  allocation  problems.  Ponnuswami  and  Khot  (2007)  present  an  algorithm 
that guarantees a $1/(2k-1)$ approximation. For the special case where the objective functions
$F_t$ are additive (modular), Asadpour and Saberi (2007) developed an algorithm that guarantees
an improved $\Omega(1/(\sqrt{k}\log^3 k))$ approximation. Both algorithms, however, only apply to
the scheduling setting (i.e., they require that m = |V|). Furthermore, note that the approximation
performance of
these  algorithms  very  quickly  decreases  with  k,  in  contrast  to  our  eSPASS  approach  that  provides  an 
approximation  guarantee  that  is  independent  of  the  number  of  time  steps.  Krause  et  al.  (’08)  consider 
the  problem  of  robust  maximization  of  submodular  functions:  Given  a  collection  of  submodular 
functions, $F_1, \dots, F_m$, they want to find a set $A$ with $|A| \le k$ that maximizes
$\min_i F_i(A)$. While this problem appears related to the SPASS problem, where we want to maximize
$\min_i F(A_i)$, the solution techniques and results are very different. First, there is a strong
conceptual difference: In robust submodular optimization, a single set $A$ is sought that maximizes
multiple functions $F_1, \dots, F_m$, whereas in SPASS, a collection of sets $A_1, \dots, A_k$ is
sought that each perform well with respect to a single function $F$. Hence, the two problem
formulations address very different optimization tasks. Second, while both algorithms exploit the
fact that truncation preserves submodularity, each contains unique algorithmic elements. Lastly, the
performance guarantees vary drastically: the robust submodular optimization problem does not admit
any approximation (and requires the relaxation of the constraint that $|A| \le k$), whereas for the
SPASS problem, eSPASS obtains a constant-factor 6 approximation.


8  Conclusions 

When  deploying  sensor  networks  for  monitoring  tasks,  both  placing  and  scheduling  the  sensors 
are  of  key  importance,  in  order  to  ensure  informative  measurements  and  long  deployment  lifetime. 
Traditionally,  the  problems  of  sensor  placement  and  scheduling  have  been  considered  separately 
from each other. In this paper, we presented an efficient algorithm, eSPASS, that simultaneously
optimizes the sensor placement and the schedule. We considered both the setting where the average-
case performance over time is optimized, as well as the balanced setting, where uniformly good
performance is required. Such balanced performance is crucial for security-critical applications such
as contamination detection. Our results indicate that optimizing for balanced performance often
yields good average-case performance, but not necessarily vice versa. We proved that our eSPASS
algorithm provides a constant factor 6 approximation to the optimal balanced solution. To the best
of our knowledge, eSPASS is the first algorithm that provides strong guarantees for this problem,
partly resolving an open problem raised by Abrams et al. (2004). Furthermore, our algorithm applies
to any setting where the sensing quality function is submodular, which allows us to address complex
sensing tasks where one intends to optimize prediction accuracy or detection performance.

We  also  considered  complex  sensor  placement  scenarios,  where  the  deployed  sensor  network 
must  be  able  to  function  well  both  in  a  scheduled,  low  power  mode,  but  also  in  a  high  accuracy 
mode,  where  all  sensors  are  activated  simultaneously.  We  developed  an  algorithm,  McSPASS,  that 
directly  optimizes  this  power-accuracy  tradeoff.  Our  results  show  that  McSPASS  yields  solutions 
which  perform  near-optimally  with  respect  to  both  the  scheduled  and  the  high-density  performance. 

We  extensively  evaluated  our  approach  on  several  real-world  sensing  case  studies,  including  traffic 
and  building  monitoring  as  well  as  contamination  detection  in  metropolitan  area  drinking  water 
networks.  When  applied  to  the  simpler  special  case  of  sensor  scheduling  (i.e.,  ignoring  the  placement 
aspect),  eSPASS  outperforms  existing  sensor  scheduling  algorithms  on  standard  data  sets.  For 
the more complex, general case, our algorithm performs provably near-optimally (as demonstrated
by tight, data-dependent bounds). Our results show that, for a fixed deployment budget, drastic
improvements in sensor network lifetime can be achieved by simultaneously optimizing the placement
and the schedule, as compared to the traditional, stage-wise approach. For example, for traffic
prediction, eSPASS achieves a 33% improvement in network lifetime compared to the setting where
placement and schedule are optimized separately, and a 100% improvement when compared to the
traditional setting where sensors are first randomly deployed and then optimally scheduled.

We believe that the results presented in this paper are an important step towards understanding
the deployment and maintenance of real-world sensor networks.

A Proofs

Proof of Theorem 3.1. Define a new ground set $V' = V \times \{1, \dots, k\}$, and a new function

$$F'(A') = \sum_{t=1}^{k} F(\{s : (s,t) \in A'\}).$$

$F'$ is monotonic and submodular. Let

$$\mathcal{I} = \{A' \subseteq V' : |A'| \le m \,\wedge\, (\nexists\, s,\ i \ne j : (s,i) \in A' \wedge (s,j) \in A')\},$$

i.e., $\mathcal{I}$ is the collection of subsets $A' \subseteq V'$ of size at most $m$ that do not
contain two pairs $(s,i)$ and $(s,j)$ of elements for which $i \ne j$. It can be shown that
$\mathcal{I}$ forms the independent sets of a matroid (c.f., Fisher et al. (1978)). Note that there is
a one-to-one correspondence between sets $A' \in \mathcal{I}$ and feasible solutions
$A_1, \dots, A_k$ to the SPASS Problem (1), and furthermore, the corresponding solutions have the same
value. Hence, the SPASS problem is equivalent to solving

$$A^* = \operatorname*{argmax}_{A' \in \mathcal{I}} F'(A').$$

As Fisher et al. (1978) proved, the greedy algorithm GAPS is guaranteed to obtain a solution that
has at least 1/2 of the optimal value. □
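
The greedy procedure over pairs $(s, t)$ that this proof refers to can be sketched as follows. This is a minimal illustration, not the paper's implementation: the function name `gaps`, its signature, and the toy coverage instance are assumptions made for the example.

```python
def gaps(F, V, k, m):
    """Greedy over pairs (s, t): repeatedly add the location s to the time
    slot t with the largest marginal gain F(A_t + s) - F(A_t)."""
    V = list(V)
    buckets = [set() for _ in range(k)]
    used = set()
    for _ in range(min(m, len(V))):
        best = None
        for s in V:
            if s in used:   # matroid constraint: each s appears in at most one slot
                continue
            for t in range(k):
                gain = F(buckets[t] | {s}) - F(buckets[t])
                if best is None or gain > best[0]:
                    best = (gain, s, t)
        _, s, t = best
        buckets[t].add(s)
        used.add(s)
    return buckets

# Hypothetical toy coverage instance.
regions = {"a": {1, 2, 3}, "b": {3, 4}, "c": {4, 5, 6}}
F = lambda A: len(set().union(*(regions[s] for s in A)))
buckets = gaps(F, regions, k=2, m=3)
total = sum(F(b) for b in buckets)
```

Each of the at most m rounds evaluates F for every remaining location and every time slot, which is where a bound of kmn function evaluations per greedy run comes from.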

Proof of Lemma 4.2. Consider an optimal allocation $A^*_1, \dots, A^*_m$. Let $B_{opt}$ be the set
$B_{opt} = \{i : A^*_i \text{ contains a big element}\}$. Throw away all buckets (and elements) in
$B_{opt}$. Now, in order to achieve score c, the optimal solution has to fill $m - |B_{opt}|$ buckets
with small elements (even from a reduced set of small elements, those not thrown away) and still
achieve score c on each of those buckets. This solution is in fact an optimal solution achieving
score c on the new problem instance (since we will use at least as many big elements and throw away
at least as many buckets). □

Proof of Lemma 4.3. Suppose $A_i$ is a bucket for which $F_c(A_i) \ge 3\beta c$. Now,
$A_i = \{a_1, \dots, a_l\}$, and $F_c(\{a_j\}) < \beta c$ for every $j$, since all elements are small.
Choose $\ell$ such that $F_c(\{a_1, \dots, a_{\ell-1}\}) < \beta c$ and
$F_c(\{a_1, \dots, a_\ell\}) \ge \beta c$. Let $\Delta = \{a_1, \dots, a_\ell\}$.

Due to monotonicity, $F_c(A_j \cup \Delta) \ge F_c(\Delta) \ge \beta c$. It remains to show that
$F_c(A_i \setminus \Delta) \ge F_c(A_i) - 2\beta c$. Suppose that
$F_c(A_i \setminus \Delta) < F_c(A_i) - 2\beta c$. Let $B = A_i \setminus \Delta$. Then

$$F_c(B \cup \Delta) - F_c(B) > 2\beta c.$$

But

$$F_c(\Delta) \le F_c(\{a_1, \dots, a_{\ell-1}\}) + F_c(\{a_\ell\}) < 2\beta c,$$

due to submodularity of $F_c$, and the fact that $a_\ell$ is a small element. Hence

$$F_c(B \cup \Delta) - F_c(B) > F_c(\Delta) - F_c(\emptyset),$$

i.e., adding $\Delta$ to $B$ helps more than adding $\Delta$ to the empty set, contradicting
submodularity of $F_c$. □
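
The reallocation move analyzed in this lemma can be sketched as below. This is an illustration under assumptions: the function name `reallocate` is hypothetical, the toy truncated function is modular (hence submodular), and all elements are "small" in the sense of the lemma.

```python
def reallocate(Fc, buckets, beta, c):
    """Move elements from a bucket with Fc >= 3*beta*c into an unsatisfied
    bucket (Fc < beta*c) until the latter reaches beta*c; repeat."""
    for _ in range(len(buckets)):   # at most one move per unsatisfied bucket
        poor = [j for j, b in enumerate(buckets) if Fc(b) < beta * c]
        rich = [i for i, b in enumerate(buckets) if Fc(b) >= 3 * beta * c]
        if not poor or not rich:
            break
        i, j = rich[0], poor[0]
        for s in list(buckets[i]):
            buckets[i].remove(s)
            buckets[j].add(s)
            if Fc(buckets[j]) >= beta * c:
                break
    return buckets

# Toy instance: eight unit-weight elements, truncation level c = 16, beta = 1/8,
# so beta*c = 2 and every single element (weight 1) is small.
weights = {s: 1 for s in range(8)}
c, beta = 16, 1 / 8
Fc = lambda A: min(sum(weights[s] for s in A), c)
before = Fc(set(range(8)))
buckets = reallocate(Fc, [set(range(8)), set()], beta, c)
```

On this instance the donor bucket loses at most $2\beta c$, in line with the statement of the lemma, and both buckets end up satisfied.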


Proof of Lemma 4.4. To simplify notation, w.l.o.g. let us assume that $c = 1$. Since the optimal
balanced performance for $F_c = F_1$ is 1, the optimal average-case performance for $F_1$ is 1 as
well. The GAPS algorithm obtains an allocation $\mathcal{A} = (A_1, \dots, A_k)$ that is a fraction
$\alpha$ of optimal. Hence, it holds that $\sum_i F_1(A_i) \ge \alpha k$. We call
$\sum_i F_1(A_i)$ the “mass” of the allocation $\mathcal{A}$.

How many unsatisfied buckets can there maximally be? Let $\gamma$ denote the fraction of unsatisfied
buckets. We know that

$$k\gamma\beta + k(1-\gamma) \ge \alpha k,$$

since the maximal $\gamma$ is achieved if all the satisfied buckets are completely full (containing
mass $k(1-\gamma)$), and the unsatisfied buckets are as full as possible without being satisfied
(hence containing mass less than $k\gamma\beta$). Hence it follows that

$$\gamma \le \frac{1-\alpha}{1-\beta}.$$

Now consider the mass $R$ distributed over the satisfied buckets. We know that

$$R \ge \alpha k - \gamma k \beta \ge k\,\frac{\alpha(1-\beta) - \beta(1-\alpha)}{1-\beta},$$

and the worst case is assumed under equality. The first reallocation move is possible if

$$\frac{R}{k(1-\gamma)} \ge 3\beta,$$

since, if the average remaining mass over all $(1-\gamma)k$ satisfied buckets is $3\beta$, then
there must be at least one bucket to which the move can be applied. Since each reallocation move
reduces the mass $R$ by at most $2\beta$ (as proved by Lemma 4.3), and since we need $\gamma k$ moves
to fill all unsatisfied buckets, it suffices to require that

$$\frac{R - 2\gamma k \beta}{k(1-\gamma)} \ge 3\beta
\;\Leftrightarrow\; R - 2\gamma k\beta \ge 3\beta k - 3\beta\gamma k
\;\Leftrightarrow\; R \ge 3\beta k - \beta\gamma k.$$

Hence, a sufficient condition for $\beta$ such that enough moves can be performed to fill all
unsatisfied buckets is

$$\frac{\alpha(1-\beta) - \beta(1-\alpha)}{1-\beta} \ge 3\beta - \beta\,\frac{1-\alpha}{1-\beta}
\;\Leftrightarrow\; 3\beta^2 + (-3-\alpha)\beta + \alpha \ge 0
\;\Leftarrow\; \beta \le \alpha/3,$$

by solving the quadratic equation for $\beta$ and ignoring the infeasible solutions $\beta \ge 1$.
Now, since $\beta$ is going to be our approximation factor, we want to maximize $\beta$ subject to
the above constraint, and hence choose $\beta = \alpha/3$. □
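
The last step can be checked with exact arithmetic. The quadratic factors as $3\beta^2 - (3+\alpha)\beta + \alpha = (3\beta - \alpha)(\beta - 1)$, so its roots are $\beta = \alpha/3$ and $\beta = 1$. The short sketch below evaluates it for $\alpha = 1/2$, the GAPS guarantee of Theorem 3.1; the helper name `condition` is hypothetical.

```python
from fractions import Fraction

def condition(alpha, beta):
    """Value of 3*beta^2 - (3 + alpha)*beta + alpha; the proof needs it to be >= 0."""
    return 3 * beta**2 - (3 + alpha) * beta + alpha

alpha = Fraction(1, 2)         # approximation factor of GAPS
beta = alpha / 3               # choice made at the end of the proof
print(beta)                    # 1/6
print(condition(alpha, beta))  # 0, i.e. beta = alpha/3 is a root
```

Any $\beta$ strictly between $\alpha/3$ and 1 makes the expression negative, which is why $\alpha/3$ is the largest admissible choice; with $\alpha = 1/2$ this yields the factor 6 quoted for eSPASS.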
Proof of Theorem 4.1. The proof immediately follows from the analysis in Section 4.2. For the
running time, notice that, in each binary search iteration, the greedy algorithm requires at most
kmn function evaluations, and the reallocation step requires at most kmn evaluations. The
binary search terminates after $O(1 + \log_2 F(V))$ iterations, assuming integrality of F. □

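Proof of Theorem 4.5. Let $A_i$ be the candidate solution for time slot $i$, and
$B_i = \{b_1, \dots, b_{n_i}\}$ an optimal solution for time slot $i$. Due to monotonicity and
submodularity, it holds that

$$F(B_i) \le F(A_i \cup B_i) \le F(A_i) + \sum_{j=1}^{n_i} \delta_{i,b_j},$$

where $\delta_{i,s} = F(A_i \cup \{s\}) - F(A_i)$ denotes the marginal gain of adding $s$ to $A_i$.
Hence, the optimal value of Problem (2) is upper bounded by the optimal solution to the following
integer program:

$$\max_{c,\,x} \; c \quad \text{s.t.} \quad
c \le F(A_i) + \sum_{s} x_{i,s}\,\delta_{i,s} \ \text{ for all } i, \qquad
\sum_{i} x_{i,s} \le 1 \ \text{ for all } s, \qquad
\sum_{i,s} x_{i,s} \le m, \qquad
x_{i,s} \in \{0,1\},$$

since any integer solution $x_{i,s}$ corresponds to a possible feasible partition
$B'_1, \dots, B'_k$. The linear program in Theorem 4.5 is the linear programming relaxation to the
above integer program. □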
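
To make the bound concrete, the integer program can be solved by brute force on a tiny instance. The sketch below is illustrative only: it enumerates all assignments rather than solving the linear programming relaxation used in the paper, and the names `ip_bound`, `regions`, and `candidate` are hypothetical.

```python
from itertools import product

def ip_bound(F, V, candidate, m):
    """Brute-force the integer program: assign each s to one slot (or none),
    using at most m assignments, to maximize
    min_i [F(A_i) + sum_s x_{i,s} * delta_{i,s}]."""
    k = len(candidate)
    base = [F(A) for A in candidate]
    # delta[i][s] is the marginal gain of adding s to the candidate set A_i.
    delta = [{s: F(A | {s}) - F(A) for s in V} for A in candidate]
    best = 0
    for x in product(range(-1, k), repeat=len(V)):   # -1 means "unassigned"
        if sum(1 for t in x if t >= 0) > m:
            continue
        vals = list(base)
        for s, t in zip(V, x):
            if t >= 0:
                vals[t] += delta[t][s]
        best = max(best, min(vals))
    return best

# Hypothetical toy coverage instance with k = 2 time slots.
regions = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {1, 4, 5}}
F = lambda A: len(set().union(*(regions[s] for s in A)))
V = list(regions)
candidate = [{"a"}, {"b"}]
bound = ip_bound(F, V, candidate, m=4)
```

The returned value is an upper bound on the optimal balanced value for this instance; the linear programming relaxation can only give a bound that is at least as large, but it is computable efficiently.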


Algorithm mcGAPS (F, B, V, k, m, c, λ)

A_t ← ∅ for all t; A ← ∅;
for i = 1 to m do
    foreach s ∈ V \ A, 1 ≤ t ≤ k do
        δ_{t,s} ← λ [F_c(A_t ∪ {s}) - F_c(A_t)] + (1 - λ) [F(A ∪ B ∪ {s}) - F(A ∪ B)];
    end
    (t*, s*) ← argmax_{t,s} δ_{t,s};
    A_{t*} ← A_{t*} ∪ {s*}; A ← A ∪ {s*};
end


Algorithm 3: The greedy average-case placement and scheduling (mcGAPS) algorithm.
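
A runnable sketch of this greedy step is given below. It is an illustration under assumptions, not the paper's code: the name `mc_gaps` and the toy instance are hypothetical, and each pair $(s, t)$ is scored by marginal gains of the truncated per-slot value $F_c$ and of the high-density value $F(A \cup B)$, weighted by $\lambda$ and $1 - \lambda$.

```python
def mc_gaps(F, B, V, k, m, c, lam):
    """Greedy selection of (location, slot) pairs for the multicriterion score.
    B is the set of already-placed big elements; V excludes B."""
    Fc = lambda A: min(F(A), c)      # truncated sensing quality
    buckets = [set() for _ in range(k)]
    A = set()                        # union of all buckets
    for _ in range(m):
        best = None
        for s in V:
            if s in A:
                continue
            dense_gain = F(A | B | {s}) - F(A | B)
            for t in range(k):
                sched_gain = Fc(buckets[t] | {s}) - Fc(buckets[t])
                score = lam * sched_gain + (1 - lam) * dense_gain
                if best is None or score > best[0]:
                    best = (score, s, t)
        if best is None:             # no locations left to place
            break
        _, s, t = best
        buckets[t].add(s)
        A.add(s)
    return buckets

# Hypothetical toy coverage instance; "a" plays the role of a big element.
regions = {"a": {1, 2}, "b": {2, 3}, "c": {4}, "d": {1, 4, 5}}
F = lambda S: len(set().union(*(regions[s] for s in S)))
buckets = mc_gaps(F, B={"a"}, V=["b", "c", "d"], k=2, m=3, c=2, lam=0.5)
```

Setting `lam=1` ignores the high-density term and scores pairs purely by the truncated per-slot gain, while `lam=0` ignores the schedule entirely.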


Proof Sketch of Theorem 5.1. We will modify eSPASS in the following way. The modified algorithm,
mcSPASS (see the pseudo code in Algorithm 4), will “guess” (binary search for) the value $c^*$
attained by the optimal solution $A^*$. It will then guess (search for) the number $\ell$ of large
elements used in the optimal solution, where we redefine “large” as $F_c(\{s\}) \ge \beta c$ with
$\beta = 1/8$, as in Algorithm 4. For such a guess, mcSPASS will first greedily select the $\ell$
large elements (according to $F$), giving a set $A_g$. It will then set these large elements aside,
and continue on the small elements. Define the function

$$G(A_1, \dots, A_k) = (1-\lambda)\big(F(A \cup A_g) - F(A_g)\big) + \frac{\lambda}{k} \sum_i \min\{F(A_i), c\},$$

where $A = A_1 \cup \dots \cup A_k$. mcSPASS will greedily maximize $G$ on the partition matroid
(similarly to using GAPS). Suppose $c^*$ is the optimal value $c^* = F_\lambda(A^*)$. Then the greedy
procedure will find a solution $A'$ such that $G(A') + (1-\lambda)F(A_g) \ge \frac{1}{2} c^*$ (since
greedy selection of big elements followed by greedy selection of small elements amounts to the
“local” greedy optimization over a partition matroid as analyzed by Fisher et al. (1978)).
Algorithm McSPASS (F, V, k, m, ε, λ)

c_min ← 0; c_max ← F(V); β ← 1/8;
while c_max - c_min ≥ ε do
    for ℓ = 0 to k do
        c ← (c_max + c_min)/2;
        B ← {s ∈ V : F_c({s}) ≥ βc};
        if |B| < ℓ then break;
        A_big ← ∅;
        for i = 1 to ℓ do
            s* ← argmax_{s ∈ B \ A_big} F(A_big ∪ {s});
            A_big ← A_big ∪ {s*}; A_{k-i+1} ← {s*};
        end
        k' ← k - ℓ;
        if k' = 0 then A_best,ℓ ← (A_1, ..., A_k); continue;
        V' ← V \ A_big; m' ← m - ℓ;
        A_{1:k'} ← mcGAPS (F, A_big, V', k', m', c, λ);
        if Σ_i F_c(A_i) < k'c/2 then c_max ← c; continue;
        else
            while ∃ i, j ≤ k' : F_c(A_j) < βc, F_c(A_i) ≥ 3βc do
                foreach s ∈ A_i do
                    A_j ← A_j ∪ {s}; A_i ← A_i \ {s};
                    if F_c(A_j) ≥ βc then break;
                end
            end
            A_best,ℓ ← (A_1, ..., A_k);
        end
    end
    ℓ* ← argmax_ℓ F_λ(A_best,ℓ);
    c_min ← c; A_best ← A_best,ℓ*;
end

Algorithm 4: The McSPASS algorithm for simultaneously optimizing scheduled and high-density
performance.


Now, at least one of $(1-\lambda)F(A_g \cup A') \ge c^*/8$ or
$\frac{\lambda}{k}\sum_i F_c(A'_i) \ge \frac{3}{8} c^*$ holds, since
$G(A') + (1-\lambda)F(A_g) = (1-\lambda)F(A_g \cup A') + \frac{\lambda}{k}\sum_i F_c(A'_i)$. In the
former case we do not need to reallocate and set $A_R = A'$. In the latter case, we use the
reallocation procedure, and arrive at a solution $A_R$ where all buckets are satisfied (since $A'$
contains only small elements), i.e., $\min_i F_c(A_{R,i}) \ge c^*/8$.

Now, $A_R \cup A_g$ is a feasible solution to the multicriterion SPASS problem, with

$$h(A_R \cup A_g) = (1-\lambda)F(A_g \cup A_R) + \lambda \min_i F(A_{R,i}) \ge c^*/8.$$

□